Explicitly set starting stack pointer with linker script - gcc

I'd like to create a program with a special section at the end of Virtual Memory. So I wanted to do a linker script something like this:
/* ... */
.section_x 0xffff0000 : {
_start_section_x = .;
. = . + 0xffff;
_end_section_x = .;
}
The problem is that gcc/ld/glibc seem to load the stack at this location by default for a 32 bit application, even if it overlaps a known section. The above code zero's out the stack causing an exception. Is there any way to tell the linker to use another VM memory location for the stack? (As well, I'd like to ensure the heap doesn't span this section of virtual memory...).

I hate answers that presume or even ask if the question is wrong but, if you need a 64k segment, why can't you just allocate one on startup?
Why could you possibly need a fixed address within your process address space? I've been doing a lot of different kinds of coding for almost 30 years, and I haven't seen the need for a fixed address since the advent of protected memory.

Related

What is difference between moving location counter in a linker script and making empty space

I have a question about linker script "vmlinux.lds.S" in the Linux kernel, v4.6. Since its code is quite long, I post a link for it: https://android.googlesource.com/kernel/msm/+/android-wear-5.1.1_r0.6/arch/arm64/kernel/vmlinux.lds.S?autodive=0%2F%2F%2F%2F
As far as I know, moving location counter(.) in a linker script makes empty space. For instance, by saying ". += 1000" one gets 1000 bytes of empty space.
However, the linker script starts with:
. = PAGE_OFFSET + TEXT_OFFSET;
Here, page offset is a (virtual) start address of the kernel space.(i.e. 0xFF...FF00...00)
Ofcourse, assigning this large value to location counter doesn't mean that we get such big empty space. Only VMA is assigned, and LMA is someting like 0x8000.
Why this code doesn't generate large empty space? Is there any exceptional rule about moving location counter?
What determines actual LMA of the kernel code(starting with _stext section)?
Thank you.

accessing process memory parts

I'm currently studying memory management of OS by the video lecture. The instructor says,
In fact, you may have, and it is quite often the case that there may
be several parts of the process memory, which are not even accessed at
all. That is, they are neither executed, loaded or stored from memory.
I don't understand the saying since even if in a simple C program, we access whole address space of it. Don't we?
#include <stdio.h>
int main()
{
printf("Hello, World!");
return 0;
}
Could you elucidate the saying? If possible could you provide an example program wherein "several parts of the process memory, which are not even accessed at all" when it is run.
Imagine you have a large and complicated utility (e.g. a compiler), and the user asks it for help (e.g. they type gcc --help instead of asking it to compile anything). In this case, how much of the utility's code and data is used?
Most programs have various optional parts that aren't used (e.g. maybe something that works with graphics will have some code for 16 bits per pixel and other code for 32 bits per pixel, and will determine which code to use and not use the other code). Most heap allocators are "eager" (e.g. they'll ask the OS for 20 MiB of space and then might only "malloc() 2 MiB of it). Sometimes a program will memory map a huge file but then only access a small part of it.
Even for your trivial "hello world" example code; the virtual address space probably contains a huge (several MiB) shared library to support lots of C standard library functions (e.g. puts(), fprintf(), sprintf(), ...) and your program only uses a small part of that shared library; and your program probably reserves a conservative amount of space for its stack (e.g. maybe 20 KiB of space for its stack) and then probably only uses a few hundred bytes of stack.
In a virtual memory system, the address space of the process is created in secondary store at start up. Little or nothing gets placed in memory. For example, the operating system may use the executable file as the page file for the code and static data. It just sets up an internal structure that says some range of memory is mapped to these blocks in the executable file. The same goes for shared libraries. The other data gets mapped to the page file.
As your program runs it starts page faulting rapidly because nothing is in memory and the operating system has to load it from secondary storage.
If there is something that your program does not reference, it never gets loaded into memory.
If you had global variable declared like
char somedata [1045] ;
and your program never references that variable, it will never get loaded into memory. The same goes for code. If you have pages of code that done get execute (e.g. error handling code) it does not get loaded. If you link to shared libraries, you will likely bece including a lot of functions that you never use. Likewise, they will not get loaded if you do not execute them.
To begin with, not all of the address space is backed by physical memory at all times, especially if your address space covers 248+ bytes, which your computer doesn't have (which is not to say you can't map most of the address space to a single physical page of memory, which would be of very little utility for anything).
And then some portions of the address space may be purposefully permanently inaccessible, like a few pages near virtual address 0 (to catch NULL pointer dereferences).
And as it's been pointed out in the other answers, with on-demand loading of programs, you may have some portions of the address space reserved for your program but if the program doesn't happen to need any of its code or data there, nothing needs to be loader there either.

Catching and avoiding memory corruption at fixed offset in physical memory

We have a 4-byte memory corruption that always occurs at a fixed offset in the physical memory.
The physical frame number is 0x00a4d and the offset is ending with dc0.
Question 1) Based on this information, can we say the physical address of corruption is 0x00a4d * PAGE_SIZE (4096) + dc0 = 0x00A4DDC0. Programmatically, what is best way to confirm the physical address? Ours is ppc64 based system.
Question 2) What would be the best way to find out this memory corruption? The more I read the more I get lost with the plethora of options. Should I use KASAN, or CONFIG_DEBUG_PAGEALLOC (debug_guardpage_minorder) option or a HW breakpoint?
Question 3) Since we know the corruption is at a fixed option, if we were to reserve/block that page, what again is the best option? The two I came across are memmap and Reserved memory regions
Thanks
1.) You are right about physical address.
2.) HW breakpoint is the best if you have such possibility. Do you have the appropriate device (t32 or whatever) / debug port/ could it place HW break at physical address?
Here is the more generic and dumb case which needs no HW support:
If I remember right from your previous post, you suspect the kernel code as a corruption causer.
If you have read anything about KASAN, you probably mentioned that gcc part places hooks instead of kernel code loads and stores. The kernel part provides kasan_store_bla_bla_bla hook, which handles correctness of this store. Very likely, that default functionality wouldn't help you, but you can integrate your code in this kasan store hook, which would:
2.1)Take the virtual address passed to the store kasan hook
2.2)Finds appropriate physical address by page tables walking like this (the more convenient API exists but i don't remember the function name):
pgd_t *pgd = pgd_offset(mm, addr);
pud_t *pud = pud_offset(pgd, addr);
pmd_t *pmd = pmd_offset(pud, addr);
...
As i remember from your previous post you get crash in userspace app, so you will be need to check all processes mms from task list.
2.3) Compare found physical address to the given, and check that written value is zero (as i remember from your previous post)
2.4) If match print backtraces for all cores and stop execution.

x86 segmentation, DOS, MZ file format, and disassembling

I'm disassembling "Test Drive III". It's a 1990 DOS game. The *.EXE has MZ format.
I've never dealt with segmentation or DOS, so I would be grateful if you answered some of my questions.
1) The game's system requirements mention 286 CPU, which has protected mode. As far as I know, DOS was 90% real mode software, yet some applications could enter protected mode. Can I be sure that the app uses the CPU in real mode only? IOW, is it guaranteed that the segment registers contain actual offset of the segment instead of an index to segment descriptor?
2) Said system requirements mention 1 MB of RAM. How is this amount of RAM even meant to be accessed if the uppermost 384 KB of the address space are reserved for stuff like MMIO and ROM? I've heard about UMBs (using holes in UMA to access RAM) and about HMA, but it still doesn't allow to access the whole 1 MB of physical RAM. So, was precious RAM just wasted because its physical address happened to be reserved for UMA? Or maybe the game uses some crutches like LIM EMS or XMS?
3) Is CS incremented automatically when the code crosses segment boundaries? Say, the IP reaches 0xFFFF, and what then? Does CS switch to the next segment before next instruction is executed? Same goes for SS. What happens when SP goes all the way down to 0x0000?
4) The MZ header of the executable looks like this:
signature 23117 "0x5a4d"
bytes_in_last_block 117
blocks_in_file 270
num_relocs 0
header_paragraphs 32
min_extra_paragraphs 3349
max_extra_paragraphs 65535
ss 11422
sp 128
checksum 0
ip 16
cs 8385
reloc_table_offset 30
overlay_number 0
Why does it have no relocation information? How is it even meant to run without address fixups? Or is it built as completely position-independent code consisting from program-counter-relative instructions? The game comes with a cheat utility which is also an MZ executable. Despite being much smaller (8448 bytes - so small that it fits into a single segment), it still has relocation information:
offset 1
segment 0
offset 222
segment 0
offset 272
segment 0
This allows IDA to properly disassemble the cheat's code. But the game EXE has nothing, even though it clearly has lots of far pointers.
5) Is there even such thing as 'sections' in DOS? I mean, data section, code (text) section etc? The MZ header points to the stack section, but it has no information about data section. Is data and code completely mixed in DOS programs?
6) Why even having a stack section in EXE file at all? It has nothing but zeroes. Why wasting disk space instead of just saying, "start stack from here"? Like it is done with BSS section?
7) MZ header contains information about initial values of SS and CS. What about DS? What's its initial value?
8) What does an MZ executable have after the exe data? The cheat utility has whole 3507 bytes in the end of the executable file which look like
__exitclean.__exit.__restorezero._abort.DGROUP#.__MMODEL._main._access.
_atexit._close._exit._fclose._fflush._flushall._fopen._freopen._fdopen
._fseek._ftell._printf.__fputc._fputc._fputchar.__FPUTN.__setupio._setvbuf
._tell.__MKNAME._tmpnam._write.__xfclose.__xfflush.___brk.___sbrk._brk._sbrk
.__chmod.__close._ioctl.__IOERROR._isatty._lseek.__LONGTOA._itoa._ultoa.
_ltoa._memcpy._open.__open._strcat._unlink.__VPRINTER.__write._free._malloc
._realloc.__REALCVT.DATASEG#.__Int0Vector.__Int4Vector.__Int5Vector.
__Int6Vector.__C0argc.__C0argv.__C0environ.__envLng.__envseg.__envSize
Is this some kind of debugging symbol information?
Thank you in advance for your help.
Re. 1. No, you can't be sure until you prove otherwise to yourself. One giveaway would be the presence of MOV CR0, ... in the code.
Re. 2. While marketing materials aren't to be confused with an engineering specification, there's a technical reason for this. A 286 CPU could address more than 1M of physical address space. The RAM was only "wasted" in real mode, and only if an EMM (or EMS) driver wasn't used. On 286 systems, the RAM past 640kb was usually "pushed up" to start at the 1088kb mark. The ISA and on-board peripherals' memory address space was mapped 1:1 into the 640-1024kb window. To use the RAM from the real mode needed an EMM or EMS driver. From protected mode, it was simply "there" as soon as you set up the segment descriptor correctly.
If the game actually needed the extra 384kb of RAM over the 640kb available in the real mode, it's a strong indication that it either switched to protected mode or required the services or an EMM or EMS driver.
Re. 3. I wish I remembered that. On reflection, I wish not :) Someone else please edit or answer separately. Hah, I did know it at some point in time :)
Re. 4. You say "[the code] has lots of instructions like call far ptr 18DCh:78Ch". This implies one of three things:
Protected mode is used and the segment part of the address is a selector into the segment descriptor table.
There is code there that relocates those instructions without DOS having to do it.
There is code there that forcibly relocates the game to a constant position in the address space. If the game doesn't use DOS to access on-disk files, it can remove DOS completely and take over, gaining lots of memory in the process. I don't recall whether you could exit from the game back to the command prompt. Some games where "play until you reboot".
Re. 5. The .EXE header does not "point" to any stack, there is no stack section you imply, the concept of sections doesn't exist as far as the .EXE file is concerned. The SS register value is obtained by adding the segment the executable was loaded at with the SS value from the header.
It's true that the linker can arrange sections contiguously in the .EXE file, but such sections' properties are not included in the .EXE header. They often can be reverse-engineered by inspecting the executable.
Re. 6. The SS and SP values in the .EXE header are not file pointers. The EXE file might have a part that maps to the stack, but that's entirely optional.
Re. 7. This is already asked and answered here.
Re. 8. This looks like a debug symbol list. The cheat utility was linked with the debugging information left in. You can have completely arbitrary data there - often it'd various resources (graphics, music, etc.).

Investigating Memory Leak

We have a slow memory leak in our application and I've already gone through the following steps in trying to analyize the cause for the leak:
Enabling user mode stack trace database in GFlags
In Windbg, typing the following command: !heap -stat -h 1250000 (where 1250000 is the address of the heap that has the leak)
After comparing multiple dumps, I see that a memory blocks of size 0xC are increasing over time and are probably the memory that is leaked.
typing the following command: !heap -flt s c
gives the UserPtr of those allocations and finally:
typing !heap -p -a address on some of those addresses always shows the following allocation call stack:
0:000> !heap -p -a 10576ef8
address 10576ef8 found in
_HEAP # 1250000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
10576ed0 000a 0000 [03] 10576ef8 0000c - (busy)
mscoreei!CLRRuntimeInfoImpl::`vftable'
7c94b244 ntdll!RtlAllocateHeapSlowly+0x00000044
7c919c0c ntdll!RtlAllocateHeap+0x00000e64
603b14a4 mscoreei!UtilExecutionEngine::ClrHeapAlloc+0x00000014
603b14cb mscoreei!ClrHeapAlloc+0x00000023
603b14f7 mscoreei!ClrAllocInProcessHeapBootstrap+0x0000002e
603b1614 mscoreei!operator new[]+0x0000002b
603d402b +0x0000005f
603d5142 mscoreei!GetThunkUseState+0x00000025
603d6fe8 mscoreei!_CorDllMain+0x00000056
79015012 mscoree!ShellShim__CorDllMain+0x000000ad
7c90118a ntdll!LdrpCallInitRoutine+0x00000014
7c919a6d ntdll!LdrpInitializeThread+0x000000c0
7c9198e6 ntdll!_LdrpInitialize+0x00000219
7c90e457 ntdll!KiUserApcDispatcher+0x00000007
This looks like thread initialization call stack but I need to know more than this.
What is the next step you would recommend to do in order to put the finger at the exact cause for the leak.
The stack recorded when using GFlags is done without utilizing .pdb and often not correct.
Since you have traced the leak down to a specific size on a given heap, you can try
to set a live break in RtlAllocateHeap and inspect the stack in windbg with proper symbols. I have used the following with some success. You must edit it to suit your heap and size.
$$ Display stack if heap handle eq 0x00310000 and size is 0x1303
$$ ====================================================================
bp ntdll!RtlAllocateHeap "j ((poi(#esp+4) = 0x00310000) & (poi(#esp+c) = 0x1303) )'k';'gc'"
Maybe you then get another stack and other ideas for the offender.
The first thing is that the new operator is the new [] operator so is there a corresponding delete[] call and not a plain old delete call?
If you suspect this code I would put a test harness around it, for instance put it in a loop and execute it 100 or 1000 times, does it still leak and proportionally.
You can also measure the memory increase using process explorer or programmatically using GetProcessInformation.
The other obvious thing is to see what happens when you comment out this function call, does the memory leak go away? You may need to do a binary chop if possible of the code to reduce the likely suspect code by half (roughly) each time by commenting out code, however, changing the behaviour of the code may cause more problems or dependant code path issues which can cause memory leaks or strange behaviour.
EDIT
Ignore the following seeing as you are working in a managed environment.
You may also consider using the STL or better yet boost reference counted pointers like shared_ptr or scoped_array for array structures to manage the lifetime of the objects.

Resources