What is the difference between moving the location counter in a linker script and making empty space - linux-kernel

I have a question about linker script "vmlinux.lds.S" in the Linux kernel, v4.6. Since its code is quite long, I post a link for it: https://android.googlesource.com/kernel/msm/+/android-wear-5.1.1_r0.6/arch/arm64/kernel/vmlinux.lds.S?autodive=0%2F%2F%2F%2F
As far as I know, moving the location counter (.) in a linker script creates empty space. For instance, ". += 1000" yields 1000 bytes of empty space.
However, the linker script starts with:
. = PAGE_OFFSET + TEXT_OFFSET;
Here, PAGE_OFFSET is the (virtual) start address of the kernel space (i.e. 0xFF...FF00...00).
Of course, assigning this large value to the location counter doesn't mean that we get such a big empty space. Only the VMA is assigned, and the LMA is something like 0x8000.
Why doesn't this code generate a large empty space? Is there any special rule about moving the location counter?
What determines the actual LMA of the kernel code (starting with the _stext section)?
Thank you.

Related

Question about VirtualQueryEx() lpAdress variable

I am building a Memory Scanner to find malware strings in a process.
By the way, when I was reading about the VirtualQueryEx function, I saw that people initialize its lpAdress argument (which is supposed to be the base address of the process) with a NULL/0 value
LPVOID lpAdress = 0
and in each loop iteration they increase the address by the size of the region they just queried, so that they move on to the next region and can map all of the process's virtual memory
lpAdress += mbi.RegionSize // mbi is a variable of type MEMORY_BASIC_INFORMATION
So, is lpAdress an address measured from 0 as the start of the process's own virtual memory, so that you don't need to get the actual base address of the process in memory? Sorry if my question looks dumb, but the MSDN documentation is confusing me.
Each process has its own virtual address space that starts at 0. The various executable files (.exe / .dll / whatever) are loaded either at addresses specified in the file or, more recently, at random addresses for security purposes.
A process can easily have mapped memory regions at addresses lower than where the process executable is loaded. For this reason, if you want to examine a process' entire memory space you need to start at 0.
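The scanning loop described in the question can be sketched without touching the Windows API. In this minimal Python simulation, fake_query is a hypothetical stand-in for VirtualQueryEx and the region layout is invented for illustration; only the "start at 0, advance by RegionSize" logic mirrors the text.

```python
from typing import Callable, Iterator, NamedTuple, Optional


class Region(NamedTuple):
    """Hypothetical stand-in for the MEMORY_BASIC_INFORMATION fields used here."""
    base: int   # BaseAddress
    size: int   # RegionSize
    state: str  # "free" or "commit" (simplified)


# Invented layout: a free gap at address 0, then mapped and free regions.
LAYOUT = [
    Region(0x00000, 0x10000, "free"),
    Region(0x10000, 0x03000, "commit"),
    Region(0x13000, 0x0D000, "free"),
    Region(0x20000, 0x01000, "commit"),
]


def fake_query(address: int) -> Optional[Region]:
    """Return the region containing `address`, or None past the end of the layout."""
    for region in LAYOUT:
        if region.base <= address < region.base + region.size:
            return region
    return None


def walk(query: Callable[[int], Optional[Region]]) -> Iterator[Region]:
    """Start at 0 and hop from region to region until the query fails."""
    address = 0
    while True:
        region = query(address)
        if region is None:
            break
        yield region
        address = region.base + region.size  # same idea as lpAdress += RegionSize


committed = [hex(r.base) for r in walk(fake_query) if r.state == "commit"]
print(committed)  # ['0x10000', '0x20000']
```

Starting anywhere above 0 would skip whatever happens to be mapped below that point, which is why the loop begins at address 0 rather than at the module base.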

How can I calculate the file offset of the memory virtual address of the export table?

I was trying to read a DLL file, and everything was fine until I reached the Optional Header Data Directories, specifically the first entry, the Export Table.
My problem is that I can't move my reader to the right offset, because the virtual address member is based on the memory VA while my reader is based on file offsets. Maybe a visual example helps:
As you can see, the virtual address that this PE viewer reads for the Export Table from the Data Directory (Optional Header) is 0x00002630 (let's refer to it as hex1 from now on).
However, when I click on the Export Table to see the actual content, the program converts this address from memory to file offset and redirects me to this address:
The address it redirects me to is 0x00001a30 (let's refer to it as hex2 from now on).
I did some tests on my own, like dividing hex1 by 8, because I thought the conversion might follow from the ratio between the memory alignment (4096) and the file alignment (512), but that didn't give me hex2. I also tried some other formulas, which gave even more bizarre results.
So, my question is: how can I calculate that file offset (hex2) if I only know the memory offset in the Data Directory (hex1)?
Assuming you are using MSVC C/C++, you first need to locate the array of IMAGE_SECTION_HEADER structures following the Optional Header. The SDK has a macro called IMAGE_FIRST_SECTION(pNtHeaders), to which you just pass the pointer to your PE header, to make this easier. It basically skips past the optional header in memory, which is where the section headers start. This macro works on both 32-bit and 64-bit Windows PE files.
Once you have the address of the IMAGE_SECTION_HEADER array, you loop through the structures up to FileHeader.NumberOfSections using pointer math. Each structure describes the relative starting memory address (VirtualAddress) of one of the named PE sections, along with the file offset (PointerToRawData) of that section within the file you have loaded.
The size of the section WITHIN the file is SizeOfRawData. At this point, you now have everything you need to translate any given RVA to a file offset. First range check each IMAGE_SECTION_HEADER's VirtualAddress with the RVA you are looking up. I.e.:
if (uRva >= pSect->VirtualAddress && uRva < (pSect->VirtualAddress + pSect->SizeOfRawData))
{
    // found
}
If you find a matching section, you then subtract the VirtualAddress from your lookup RVA, then add the PointerToRawData offset:
uFileOffset = uRva - pSect->VirtualAddress + pSect->PointerToRawData;
This results in an offset from the beginning of the file corresponding to that RVA. At this point you have translated the RVA to a file offset.
NOTE: Due to padding, malformed PE files, etc., you may find that not all RVAs map to a location within the file, in which case you might display an error message.
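The lookup described above can be sketched in a few lines of Python. The section table below is hypothetical: its values are chosen only so that the result matches the hex1/hex2 pair from the question, and a real tool would read the IMAGE_SECTION_HEADER array from the file instead.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int      # VirtualAddress (RVA where the section starts)
    size_of_raw_data: int     # SizeOfRawData (bytes of the section in the file)
    pointer_to_raw_data: int  # PointerToRawData (file offset of the section)


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Translate an RVA to a file offset, or return None if no section holds it."""
    for sect in sections:
        start = sect.virtual_address
        if start <= rva < start + sect.size_of_raw_data:
            return rva - start + sect.pointer_to_raw_data
    return None


# Hypothetical section table (assumed values, not taken from a real DLL).
SECTIONS = [
    Section(".text", 0x1000, 0x1000, 0x0400),
    Section(".rdata", 0x2000, 0x0A00, 0x1400),
]

print(hex(rva_to_file_offset(0x2630, SECTIONS)))  # 0x1a30
```

Returning None for an unmapped RVA leaves it to the caller to decide how to report the case mentioned in the note.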

How do I make space for my code cave in a Windows PE 32bit executable

So I want to make a space for my code caves in minesweeper.exe (typical Windows XP minesweeper game, link: Minesweeper). So I modified the PE header of the file via CFF Explorer to increase size of the .text section.
I tried increasing raw size of .text segment by 1000h (new size was 3B58), but Windows was unable to locate the entry point and the game failed to launch. Then I tried increasing the size of the .rsrc section, adding a new section, increasing the image size, but none of those attempts were successful, Windows was saying that "This is not x32 executable".
So here is the question: how do I make space for my code cave? I don't want to search for empty space left by the compiler, I want to have nice and clean 1000h bytes for my code. A tutorial for that and a detailed explanation for how to do that without corrupting a game would be GREAT! (And yes, I am actually hacking a minesweeper)
You can't increase the size of a section without invalidating the following ones (typically because it invalidates offsets and addresses in those sections). It remains possible, but it's extremely error-prone and isn't worth the hassle when you have a simpler solution.
Typically, you just need to add a section at the end of the PE and jump there from the code section. There is usually a little bit of space at the end of the code section (a code cave), so you can place your JMPs (or a little code stub) there to redirect to the new section. You can also add other new sections for data, new resources, or whatever you want.
Note: I'm using two tools: CFF Explorer as a PE browser, and a hex editor.
This file is quite particular so it is a little bit harder than usual to add a new section.
Let's start!
Below is a hex view of the array of IMAGE_SECTION_HEADER:
Usually there is some room to add a new section but in this particular case, there's none... The last section header is immediately followed by something.
Judging by the content, this is probably a bound import directory, which is confirmed in CFF explorer (offset of the bound directory is 0x248):
Bound import directories are of no use today, especially with ASLR, so we can zero out the whole directory (its size is 0xA8 bytes as indicated in the previous screenshot):
You can also zero out the Bound Import directory RVA in the Data Directories although this is not strictly required:
Now, it's time to add the new section.
Add a new section
Minesweeper comes with 3 sections by default, so increment the Number of sections from 3 to 4:
Go to the section headers and add a new section (you can do it directly in CFF Explorer). I named mine .foobar; be aware that section names are at most 8 characters and don't need to end with a NULL byte:
You need to choose two numbers:
The raw size of the new section (I picked 0x400); it must be a multiple of FileAlignment (which is 0x200 in this case).
The virtual size of the new section (I picked 0x1000); it must be a multiple of SectionAlignment (which is 0x1000 for this binary).
Now we need to calculate the two other members, Virtual Address and Raw Address.
Virtual Address
Let's take an example with the first and second section.
The first section starts at virtual address 0x1000 and has a virtual size of 0x3A56. The next section's virtual address must be aligned on SectionAlignment (0x1000), so the calculation is (using Python here):
>>> def round_up_multiple_of(number, multiple):
...     num = number + (multiple - 1)
...     return num - (num % multiple)
...
>>> hex(round_up_multiple_of(0x1000 + 0x3a56, 0x1000))
'0x5000'
This gives 0x5000, which is right (the .data section starts at virtual address 0x5000).
Now, where should our last section start?
The .rsrc section starts at 0x6000 and has a size of 0x19160:
>>> hex(round_up_multiple_of(0x6000 + 0x19160, 0x1000))
'0x20000'
So it must start at virtual address 0x20000. Put that number in Virtual Address.
Raw address
(Typically this is not needed: as all sections are already aligned, the last section must start right at the end of the file. But we'll do it anyway.)
As a reminder, the raw address is an address in the file (not in memory).
Let's start with an example (first and second section):
The first section's raw address is 0x400 and its raw size is 0x3c00. FileAlignment is 0x200, thus:
>>> hex(round_up_multiple_of(0x400 + 0x3c00, 0x200))
'0x4000'
The second section should start in the file (its raw address) at 0x4000, which is right.
Thus for our new section, the calculation is:
.rsrc section starts in the file at 0x4200
.rsrc section size on file is 0x19200
FileAlignment is 0x200
The calculation is as follows:
>>> hex(round_up_multiple_of(0x4200 + 0x19200, 0x200))
'0x1d400'
Our last section starts at raw address 0x1d400 in the file, which is confirmed with a hex editor:
Final steps
One last step is required, the calculation of the SizeOfImage field in the Optional header. According to the PE specification the field is:
The size (in bytes) of the image, including all headers, as the image
is loaded in memory. It must be a multiple of SectionAlignment.
Hence the calculation can be simplified as: VirtualAddress + VirtualSize of the last section, aligned on SectionAlignment (0x1000):
>>> hex(round_up_multiple_of(0x20000 + 0x1000, 0x1000))
'0x21000'
Now, save all your modifications in CFF explorer and exit.
Adding room for the new section
The last step is to add the required bytes for the last section. As I chose a raw size of 0x400, I insert 0x400 bytes at the raw address (0x1d400) with a hex editor.
Save your file. If you followed all the steps, it should work as is (tested on Win 10), and you can start the modified executable without any errors.
Try experimenting with a different raw size for the new section if 0x400 is not enough.
Now you have a new empty section, the rest is up to you for modifying the code :)
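The arithmetic from the walkthrough can be collected into one short Python sketch. It reuses round_up_multiple_of and only the numbers quoted above for this particular minesweeper.exe; the variable names are illustrative, and the values would differ for another binary.

```python
def round_up_multiple_of(number: int, multiple: int) -> int:
    """Round `number` up to the next multiple of `multiple`."""
    num = number + (multiple - 1)
    return num - (num % multiple)


SECTION_ALIGNMENT = 0x1000
FILE_ALIGNMENT = 0x200

# Last existing section (.rsrc), values quoted in the walkthrough.
rsrc_va, rsrc_vsize = 0x6000, 0x19160
rsrc_raw, rsrc_raw_size = 0x4200, 0x19200

# Sizes picked for the new section in the walkthrough.
new_vsize, new_raw_size = 0x1000, 0x400

# Virtual Address and Raw Address of the new section header.
new_va = round_up_multiple_of(rsrc_va + rsrc_vsize, SECTION_ALIGNMENT)
new_raw = round_up_multiple_of(rsrc_raw + rsrc_raw_size, FILE_ALIGNMENT)

# SizeOfImage: end of the last section, aligned on SectionAlignment.
size_of_image = round_up_multiple_of(new_va + new_vsize, SECTION_ALIGNMENT)

print(hex(new_va), hex(new_raw), hex(size_of_image))  # 0x20000 0x1d400 0x21000
```

Changing new_raw_size to another multiple of FILE_ALIGNMENT leaves these three results unchanged; only the number of bytes to insert at new_raw changes.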

GDB find command error "warning: Unable to access x bytes of target memory at y, halting search"

I'm trying to find current flag count in KMines by using gdb. I know that I should look for memory mappings first to avoid non-existent memory locations. So I ran info proc mappings command to see the memory segments. I picked up a random memory gap (0xd27000-0x168b000) from the result and executed the find command like this: find 0x00d27000, 0x0168b000, 10
But I got the message "warning: Unable to access 1458 bytes of target memory at 0x168aa4f, halting search." Although the address 0x168aa4f is between 0xd27000 and 0x168b000, gdb says that it can't access it. Why does this happen? What can I do to avoid this situation? Or is there a way to ignore unmapped/inaccessible memory locations?
Edit: I tried to set the value of the address 0x168aa4f to 1 and it works, so gdb can actually access that address but gives error when used with the find command. But why?
I guess I have solved my own problem; I can't believe how simple the solution was. The only thing I did was decrease the second parameter's value by one. So the command should be find 0x00d27000, 0x0168afff, 10, because Linux reports mappings in half-open [x,y) format. If a line in /proc/pid/maps says something like this:
01a03000-0222a000 rw-p
The memory allocated includes 0x01a03000 but not 0x0222a000. Hope this silly mistake of mine helps someone :D
Edit: The root of the problem is the algorithm implemented in target.c (in gdb's source code): it reads and searches memory in chunks of 16000 bytes. So even if only the last byte of a chunk is invalid, gdb discards the entire chunk and gives no information about the invalid byte itself; it only reports the beginning of the current chunk.
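A small Python sketch makes the half-open range explicit. parse_maps_range is a hypothetical helper that assumes the usual "start-end perms ..." layout of a maps line, and the second part simply redoes the arithmetic on the numbers from the warning message.

```python
from typing import Tuple


def parse_maps_range(line: str) -> Tuple[int, int]:
    """Return (start, end) from a maps line; `end` is exclusive."""
    addr_range = line.split()[0]
    start_s, end_s = addr_range.split("-")
    return int(start_s, 16), int(end_s, 16)


start, end = parse_maps_range("01a03000-0222a000 rw-p")
last_valid = end - 1  # highest address that is actually mapped
print(hex(start), hex(last_valid))  # 0x1a03000 0x2229fff

# The failing read from the question: 1458 bytes at 0x168aa4f.
chunk_start, chunk_len = 0x168AA4F, 1458
last_byte_read = chunk_start + chunk_len - 1
print(hex(last_byte_read))  # 0x168b000
```

The last byte of the failing chunk is exactly 0x168b000, the exclusive end of the mapping, which is consistent with the fix of passing 0x0168afff as the end address.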

x86 segmentation, DOS, MZ file format, and disassembling

I'm disassembling "Test Drive III". It's a 1990 DOS game. The *.EXE has MZ format.
I've never dealt with segmentation or DOS, so I would be grateful if you answered some of my questions.
1) The game's system requirements mention 286 CPU, which has protected mode. As far as I know, DOS was 90% real mode software, yet some applications could enter protected mode. Can I be sure that the app uses the CPU in real mode only? IOW, is it guaranteed that the segment registers contain actual offset of the segment instead of an index to segment descriptor?
2) Said system requirements mention 1 MB of RAM. How is this amount of RAM even meant to be accessed if the uppermost 384 KB of the address space are reserved for stuff like MMIO and ROM? I've heard about UMBs (using holes in the UMA to access RAM) and about the HMA, but they still don't give access to the whole 1 MB of physical RAM. So, was precious RAM just wasted because its physical address happened to be reserved for the UMA? Or maybe the game uses some crutches like LIM EMS or XMS?
3) Is CS incremented automatically when the code crosses segment boundaries? Say, the IP reaches 0xFFFF, and what then? Does CS switch to the next segment before next instruction is executed? Same goes for SS. What happens when SP goes all the way down to 0x0000?
4) The MZ header of the executable looks like this:
signature 23117 "0x5a4d"
bytes_in_last_block 117
blocks_in_file 270
num_relocs 0
header_paragraphs 32
min_extra_paragraphs 3349
max_extra_paragraphs 65535
ss 11422
sp 128
checksum 0
ip 16
cs 8385
reloc_table_offset 30
overlay_number 0
Why does it have no relocation information? How is it even meant to run without address fixups? Or is it built as completely position-independent code consisting of program-counter-relative instructions? The game comes with a cheat utility which is also an MZ executable. Despite being much smaller (8448 bytes, so small that it fits into a single segment), it still has relocation information:
offset 1
segment 0
offset 222
segment 0
offset 272
segment 0
This allows IDA to properly disassemble the cheat's code. But the game EXE has nothing, even though it clearly has lots of far pointers.
5) Is there even such a thing as 'sections' in DOS? I mean a data section, a code (text) section, etc. The MZ header points to the stack section, but it has no information about a data section. Are data and code completely mixed in DOS programs?
6) Why even have a stack section in the EXE file at all? It has nothing but zeroes. Why waste disk space instead of just saying, "start the stack from here", as is done with the BSS section?
7) MZ header contains information about initial values of SS and CS. What about DS? What's its initial value?
8) What does an MZ executable have after the exe data? The cheat utility has a whole 3507 bytes at the end of the executable file which look like
__exitclean.__exit.__restorezero._abort.DGROUP#.__MMODEL._main._access.
_atexit._close._exit._fclose._fflush._flushall._fopen._freopen._fdopen
._fseek._ftell._printf.__fputc._fputc._fputchar.__FPUTN.__setupio._setvbuf
._tell.__MKNAME._tmpnam._write.__xfclose.__xfflush.___brk.___sbrk._brk._sbrk
.__chmod.__close._ioctl.__IOERROR._isatty._lseek.__LONGTOA._itoa._ultoa.
_ltoa._memcpy._open.__open._strcat._unlink.__VPRINTER.__write._free._malloc
._realloc.__REALCVT.DATASEG#.__Int0Vector.__Int4Vector.__Int5Vector.
__Int6Vector.__C0argc.__C0argv.__C0environ.__envLng.__envseg.__envSize
Is this some kind of debugging symbol information?
Thank you in advance for your help.
Re. 1. No, you can't be sure until you prove otherwise to yourself. One giveaway would be the presence of MOV CR0, ... in the code.
Re. 2. While marketing materials aren't to be confused with an engineering specification, there's a technical reason for this. A 286 CPU could address more than 1M of physical address space. The RAM was only "wasted" in real mode, and only if an EMM (or EMS) driver wasn't used. On 286 systems, the RAM past 640kb was usually "pushed up" to start at the 1088kb mark. The ISA and on-board peripherals' memory address space was mapped 1:1 into the 640-1024kb window. Using that RAM from real mode required an EMM or EMS driver. From protected mode, it was simply "there" as soon as you set up the segment descriptor correctly.
If the game actually needed the extra 384kb of RAM over the 640kb available in real mode, it's a strong indication that it either switched to protected mode or required the services of an EMM or EMS driver.
Re. 3. I wish I remembered that. On reflection, I wish not :) Someone else please edit or answer separately. Hah, I did know it at some point in time :)
Re. 4. You say "[the code] has lots of instructions like call far ptr 18DCh:78Ch". This implies one of three things:
Protected mode is used and the segment part of the address is a selector into the segment descriptor table.
There is code there that relocates those instructions without DOS having to do it.
There is code there that forcibly relocates the game to a constant position in the address space. If the game doesn't use DOS to access on-disk files, it can remove DOS completely and take over, gaining lots of memory in the process. I don't recall whether you could exit from the game back to the command prompt. Some games were "play until you reboot".
Re. 5. The .EXE header does not "point" to any stack, and there is no stack section as you imply; the concept of sections doesn't exist as far as the .EXE file is concerned. The SS register value is obtained by adding the segment the executable was loaded at to the SS value from the header.
It's true that the linker can arrange sections contiguously in the .EXE file, but such sections' properties are not included in the .EXE header. They often can be reverse-engineered by inspecting the executable.
Re. 6. The SS and SP values in the .EXE header are not file pointers. The EXE file might have a part that maps to the stack, but that's entirely optional.
Re. 7. This is already asked and answered here.
Re. 8. This looks like a debug symbol list. The cheat utility was linked with the debugging information left in. You can have completely arbitrary data there; often it'd be various resources (graphics, music, etc.).
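The relationship between the header fields and the loaded program can be sketched in Python using the numbers from the question. This is a minimal sketch that assumes the usual MZ conventions (512-byte blocks, 16-byte paragraphs, CS and SS stored relative to the start of the load image); LOAD_SEGMENT is a hypothetical value, since DOS picks the real one at load time.

```python
PARAGRAPH = 16  # bytes per paragraph
BLOCK = 512     # bytes per block ("page") in the MZ header

# Header fields quoted in the question.
header = {
    "bytes_in_last_block": 117,
    "blocks_in_file": 270,
    "header_paragraphs": 32,
    "min_extra_paragraphs": 3349,
    "ss": 11422, "sp": 128,
    "cs": 8385, "ip": 16,
}

# Size of the part of the file described by the header, then the load image.
exe_size = header["blocks_in_file"] * BLOCK
if header["bytes_in_last_block"]:
    exe_size -= BLOCK - header["bytes_in_last_block"]
image_size = exe_size - header["header_paragraphs"] * PARAGRAPH

LOAD_SEGMENT = 0x1000  # hypothetical: chosen by DOS when the program is loaded

# Initial register values: header value plus the load segment.
cs = LOAD_SEGMENT + header["cs"]
ss = LOAD_SEGMENT + header["ss"]
entry_linear = cs * PARAGRAPH + header["ip"]  # real-mode segment:offset -> linear

# Where the initial stack top sits, relative to the start of the load image.
stack_top = header["ss"] * PARAGRAPH + header["sp"]
min_memory = image_size + header["min_extra_paragraphs"] * PARAGRAPH

print(exe_size, image_size)   # 137845 137333
print(hex(cs), hex(entry_linear))
print(stack_top > image_size, stack_top <= min_memory)
```

Under these assumptions the initial stack top falls beyond the bytes loaded from the file but inside the extra paragraphs requested by the header, so SS:SP names a location in memory rather than a position in the file.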
