Why does the stack address grow towards decreasing memory addresses? - memory-management

I read in text books that the stack grows by decreasing memory address; that is, from higher address to lower address. It may be a bad question, but I didn't get the concept right. Can you explain?

First, it's platform dependent. In some architectures, the stack is allocated from the bottom of the address space and grows upwards.
Assuming an architecture like x86, where the stack grows downwards from the top of the address space, the idea is pretty simple:
=============== Highest Address (e.g. 0xFFFF)
|             |
|    STACK    |
|             |
|-------------|  <- Stack Pointer   (e.g. 0xEEEE)
|             |
.     ...     .
|             |
|-------------|  <- Heap Pointer    (e.g. 0x2222)
|             |
|    HEAP     |
|             |
=============== Lowest Address  (e.g. 0x0000)
To grow the stack, you'd decrease the stack pointer:
=============== Highest Address (e.g. 0xFFFF)
|             |
|    STACK    |
|             |
|.............|  <- Old Stack Pointer (e.g. 0xEEEE)
|             |
|    Newly    |
|  allocated  |
|-------------|  <- New Stack Pointer (e.g. 0xAAAA)
.     ...     .
|             |
|-------------|  <- Heap Pointer      (e.g. 0x2222)
|             |
|    HEAP     |
|             |
=============== Lowest Address  (e.g. 0x0000)
As you can see, to grow the stack, we have decreased the stack pointer from 0xEEEE to 0xAAAA, whereas to grow the heap, you have to increase the heap pointer.
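The two diagrams can be mimicked with a small toy model. This is only a sketch: the class `ToyAddressSpace` and its method names are made up for illustration, and the addresses are the example values from the diagrams, not anything a real system uses.

```python
class ToyAddressSpace:
    """Toy model of the diagrams: the stack grows down, the heap grows up."""

    def __init__(self, stack_pointer=0xEEEE, heap_pointer=0x2222):
        self.stack_pointer = stack_pointer  # lowest address used by the stack
        self.heap_pointer = heap_pointer    # first address above the heap

    def free_bytes(self):
        # The gap between the two pointers is shared by both areas.
        return self.stack_pointer - self.heap_pointer

    def grow_stack(self, size):
        if size > self.free_bytes():
            raise MemoryError("stack would collide with the heap")
        self.stack_pointer -= size  # growing the stack decreases the pointer
        return self.stack_pointer

    def grow_heap(self, size):
        if size > self.free_bytes():
            raise MemoryError("heap would collide with the stack")
        start = self.heap_pointer
        self.heap_pointer += size   # growing the heap increases the pointer
        return start


space = ToyAddressSpace()
new_sp = space.grow_stack(0x4444)     # 0xEEEE - 0x4444 = 0xAAAA
heap_block = space.grow_heap(0x1000)  # the new block starts at the old heap pointer
print(hex(new_sp), hex(heap_block), hex(space.heap_pointer))
```

Either area can keep growing until the two pointers meet, which is the point of placing them at opposite ends.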
Obviously, this is a simplification of the memory layout. The actual executable, data section, and so on are also loaded in memory. Besides, threads have their own stack space.
You may ask why the stack should grow downwards. Well, as I said before, some architectures do the reverse, making the heap grow downwards and the stack grow upwards. It makes sense to put the stack and heap on opposite sides, as it prevents overlap and allows both areas to grow freely as long as you have enough address space available.
Another valid question could be: isn't the program supposed to decrease/increase the stack pointer itself? How can an architecture impose one over the other on the programmer? Why is it architecture dependent rather than program dependent?
While you can pretty much fight the architecture and somehow get away with growing your stack in the opposite direction, some instructions, notably call and ret, modify the stack pointer directly and are going to assume the other direction, making a mess.

Nowadays it's largely because it's been done that way for a long time and lots of programs assume it's done that way, and there's no real reason to change it.
Back when dinosaurs roamed the earth and computers had 8kB of memory if you were lucky, though, it was an important space optimization. You put the bottom of the stack at the very top of memory, growing down, and you put the program and its data at the very bottom, with the malloc area growing up. That way, the only limit on the size of the stack was the size of the program + heap, and vice versa. If the stack instead started at 4kB (for instance) and grew up, the heap could never get bigger than 4kB (minus the size of the program) even if the program only needed a few hundred bytes of stack.
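The arithmetic behind that argument can be written out. The sketch below compares the two layouts for an 8kB machine; the program size (1kB) and the stack usage (256 bytes) are assumed numbers chosen only for illustration, and the function names are hypothetical.

```python
KB = 1024


def max_heap_shared_gap(total_memory, program_size, stack_used):
    """Stack at the top growing down, heap above the program growing up.

    Both areas draw from one shared gap, so the heap can take whatever
    the stack is not using.
    """
    return total_memory - program_size - stack_used


def max_heap_fixed_split(stack_start, program_size):
    """Stack starts at `stack_start` and grows up.

    The heap is confined below `stack_start`, however little stack is used.
    """
    return stack_start - program_size


shared = max_heap_shared_gap(8 * KB, 1 * KB, 256)
split = max_heap_fixed_split(4 * KB, 1 * KB)
print(shared, split)  # 6912 3072
```

With these assumed numbers the shared layout leaves more than twice as much room for the heap, even though the stack needs are identical.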

From man clone: The child_stack argument specifies the location of the stack used by the child process. Since the child and calling process may share memory, it is not possible for the child process to execute in the same stack as the calling process. The calling process must therefore set up memory space for the child stack and pass a pointer to this space to clone(). Stacks grow downward on all processors that run Linux (except the HP PA processors), so child_stack usually points to the topmost address of the memory space set up for the child stack.

On x86, the primary reason the stack grows toward decreasing memory addresses is that the PUSH instruction decrements the stack pointer:
Decrements the stack pointer and then stores the source operand on the top of the stack.
See p. 4-511 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual.

Related

Assembly - How to modify stack size?

I am a newbie in assembly programming and I am using push and pop instructions that use the memory stack.
So, what is the default stack size, how can I modify it, and what is the limit of its size?
Stack size depends upon a lot of factors.
It depends on where you start the stack, how much memory you have, what CPU you are using etc.
The CPU you are using is not called a "Windows CPU".
When you say what CPU you are using, give the name of that CPU in detail and, very importantly, the architecture of the CPU. In this case, you are probably using the x86 architecture.
Here is a memory map for x86 architecture:
All addresses above 0x100000 - Free
0x100000 - 0xc0000 - BIOS
0xc0000 - 0xa0000 - Video memory
0xa0000 - 0x9fc00 - Extended BIOS data area
0x9fc00 - 0x7e00 - Free
0x7e00 - 0x7c00 - Boot loader
0x7c00 - 0x500 - Free
0x500 - 0x400 - BIOS data area
0x400 - 0x00 - Interrupt vector table
In x86, stack information is held by two registers:
Base pointer (bp): By convention, holds the starting (base) address of the stack or of the current stack frame
Stack pointer (sp): Holds the address of the top of the stack, that is, of the most recently pushed value
These registers have different names in different modes:
                        Base pointer   Stack pointer
16 bit real mode:       bp             sp
32 bit protected mode:  ebp            esp
64 bit mode:            rbp            rsp
When you set up a stack, the stack pointer and base pointer get the same address.
The stack is set up at the address specified in the base pointer register.
You can set up your stack anywhere in memory that is free, and the stack grows downwards.
Each time you "push" something onto the stack, the stack pointer register (which is the same as the base pointer at the beginning) is decremented, and the value is then stored at the address the stack pointer now holds.
Each time you "pop" something from the stack, the value stored at the address held in the stack pointer register is copied into a register specified by the programmer, and the stack pointer register is incremented.
In 16 bit real mode, you "push" and "pop" 16 bits. So each time you "push" or "pop", the stack pointer register is decremented or incremented by 0x02, since each address holds 8 bits.
In 32 bit protected mode, you "push" and "pop" 32 bits. So each time you "push" or "pop", the stack pointer register is decremented or incremented by 0x04, since each address holds 8 bits.
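The push/pop bookkeeping can be modelled in a few lines. This is a toy sketch, not an emulator: `ToyX86Stack` is a made-up class, memory is a plain byte array, and the order of operations follows the Intel description quoted earlier (decrement the stack pointer, then store).

```python
class ToyX86Stack:
    """Toy model of x86 push/pop on a byte-addressed memory."""

    def __init__(self, initial_sp, word_bytes, memory_size=0x10000):
        self.memory = bytearray(memory_size)
        self.sp = initial_sp          # sp/esp: address of the top of the stack
        self.word_bytes = word_bytes  # 2 in 16 bit mode, 4 in 32 bit mode

    def push(self, value):
        if self.sp - self.word_bytes < 0:
            raise MemoryError("stack pointer would drop below address 0")
        self.sp -= self.word_bytes    # decrement first ...
        self.memory[self.sp:self.sp + self.word_bytes] = value.to_bytes(
            self.word_bytes, "little")  # ... then store (x86 is little-endian)

    def pop(self):
        raw = self.memory[self.sp:self.sp + self.word_bytes]
        self.sp += self.word_bytes    # read first, then increment
        return int.from_bytes(raw, "little")


real_mode = ToyX86Stack(initial_sp=0x7C00, word_bytes=2)
real_mode.push(0x1234)
print(hex(real_mode.sp))              # 0x7bfe: the pointer moved down by 2

protected_mode = ToyX86Stack(initial_sp=0x8000, word_bytes=4)
protected_mode.push(0xDEADBEEF)
print(hex(protected_mode.sp))         # 0x7ffc: the pointer moved down by 4
```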
You will have to set up the stack in the right place depending upon how many values you are going to be "pushing".
If you keep "pushing", your stack keeps growing downwards and at some point your stack may overwrite something. So be wise and set up the stack at an address in memory where there is plenty of room for the stack to grow downwards.
For example:
If you set up your stack at 0x7c00, just below the boot loader, and you "push" too many values, your stack might eventually overwrite the BIOS data area, which causes a lot of errors.
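How many values is "too many" follows directly from the memory map. A small worked calculation, assuming the free region from 0x500 to 0x7c00 listed in the map above (the function name is hypothetical):

```python
STACK_START = 0x7C00        # stack set up just below the boot loader
FREE_REGION_BOTTOM = 0x500  # below this lies the BIOS data area


def pushes_before_overwrite(stack_start, lower_limit, word_bytes):
    """Number of pushes that fit before the stack reaches `lower_limit`."""
    return (stack_start - lower_limit) // word_bytes


real_mode_pushes = pushes_before_overwrite(STACK_START, FREE_REGION_BOTTOM, 2)
protected_mode_pushes = pushes_before_overwrite(STACK_START, FREE_REGION_BOTTOM, 4)
print(real_mode_pushes, protected_mode_pushes)  # 15232 7616
```

So in 16 bit real mode there is room for 15232 pushes (0x7700 bytes) before the next one would land in the BIOS data area.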
You should have a basic idea of a stack and the size of it by now.
Whatever loaded ("the loader") your program into memory, and passed control to it, determines where in memory the stack is located, and how much space is available for the stack.
It does so by the simple artifice of loading the stack pointer, typically using a MOV ESP, ... instruction before calling/jumping to your code. Your program then uses the stack area supplied.
If your program uses too much, it will write beyond the end of the allocated stack area. This is a program bug, because the memory past the end may be allocated for some other purpose in the application. Writing on that other memory is likely to change the program behavior (e.g., "bug") when that memory gets used, and finding the cause of that bug is likely to be difficult (people assume that stacks don't damage program data and vice versa).
If your application wants to use a larger stack, generally all you have to do is allocate your own area, large enough for your purposes, and do a MOV ESP, ... yourself to set the stack to the chosen location. How you allocate an area depends on the execution environment in which you run. (You need to respect ESP conventions: must be a multiple of 4, should be initialized to the bottom of a cache line, often useful to initialize to the bottom of virtual memory page).
It is generally a good idea when "switching" stacks to save the old value of ESP provided by the loader, and restore ESP to that old value before returning control to the loader/caller/OS. Likewise, you should free the extended stack space no longer being used.
This scheme will work if you know the amount of stack space you need in advance. In practice, this is rather hard to "guess" (and may be impossible if your code has a recursive algorithm that nests deeply). So you can either pick a really huge number bigger than you need (ick) or you can use an organized approach to switch stacks when it is clear to the program that it needs more.
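The save, switch, and restore discipline described above can be sketched as a toy model. Nothing here touches a real ESP: `ToyCpu` and `switched_stack` are hypothetical names, and the sketch assumes a downward-growing stack, so the new stack pointer starts at the high end of the allocated area, rounded down to a multiple of 4.

```python
from contextlib import contextmanager


class ToyCpu:
    """Holds a pretend ESP register."""

    def __init__(self, esp):
        self.esp = esp


@contextmanager
def switched_stack(cpu, area_base, area_size):
    """Point ESP at a new stack area, then restore the loader's value."""
    # The stack grows down, so start at the top of the area, 4-byte aligned.
    new_esp = (area_base + area_size) & ~0x3
    saved_esp = cpu.esp      # remember the value the loader provided
    cpu.esp = new_esp        # the "MOV ESP, ..." step
    try:
        yield new_esp
    finally:
        cpu.esp = saved_esp  # restore before returning control to the caller


cpu = ToyCpu(esp=0x0012FF00)
with switched_stack(cpu, area_base=0x00500000, area_size=0x10002) as top:
    esp_inside = cpu.esp
esp_after = cpu.esp
print(hex(esp_inside), hex(esp_after))
```

Wrapping the switch in a context manager is just one way to make sure the restore step cannot be forgotten; freeing the area afterwards is left out of the sketch.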
See How does a stackless language work? for more discussion.

Is it okay to use dictionary memory without 'allot'?

I am doing a programming exercise where I'm trying to do the same thing in different ways. (I happen to be adding two 3 element vectors together in Forth). In one of my revisions I used the return stack to store temporary values (so I am using that feature), but in addition to that I am considering using un-allocated memory as temporary storage.
I created two words to access this memory:
: front! here + ! ;
: front@ here + @ ;
I tried it in my experiment, and it seemed to work for what I was doing. I don't have any intention of using this memory after my routines are done. And I am staying within the dictionary, whose memory has already been given to the program.
But, my gut still tells me that this is a bad thing to do. Is this such a bad thing?
If it matters, I'm using Gforth.
Language-lawyer strictly speaking, no. ANS Forth 3.3.3.2 states:
A program may perform address arithmetic within contiguously allocated regions.
You are performing address arithmetic outside any allocated region.
However, it might be perfectly fine in some particular implementation, such as Gforth.
Note that there is a word called PAD, which returns an address to a temporary memory region.
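Why un-allotted space is fragile can be seen in a toy model of a data-space pointer. This is a sketch only and not how Gforth manages its dictionary: `ToyDataSpace` is a made-up class, and it counts in cells rather than address units to keep the example short.

```python
class ToyDataSpace:
    """Toy model of a Forth data space with a HERE pointer (in cells)."""

    def __init__(self, size=64):
        self.cells = [0] * size
        self.here = 0            # index of the first unallocated cell

    def allot(self, n):
        """Like ALLOT: reserve n cells and return where they start."""
        start = self.here
        self.here += n
        return start

    def comma(self, value):
        """Like `,`: reserve one cell at HERE and store value in it."""
        self.cells[self.allot(1)] = value

    def store_past_here(self, offset, value):
        """Like the question's store word: write at HERE + offset, no ALLOT."""
        self.cells[self.here + offset] = value

    def fetch_past_here(self, offset):
        return self.cells[self.here + offset]


space = ToyDataSpace()
space.store_past_here(0, 111)         # scratch value in unallocated space
before = space.fetch_past_here(0)     # still there: 111
space.comma(999)                      # anything that allocates moves HERE ...
after = space.fetch_past_here(0)      # ... so the same offset is another cell
clobbered = space.cells[0]            # and the old scratch cell now holds 999
print(before, after, clobbered)
```

As long as nothing allocates between the store and the fetch, the scratch value survives, which matches the observation that it "seemed to work"; the risk is that some other word allocates or uses the same region in between.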
It's okay if you know what you are doing, but PAD is a better place than HERE to do it. There is also the alternative of ALLOCATE and FREE:
ALLOCATE ( u -- a-addr ior )
Allocate u address units of contiguous data space. The data-space
pointer is unaffected by this operation. The initial content of the
allocated space is undefined.
If the allocation succeeds, a-addr is the aligned starting address of
the allocated space and ior is zero.
If the operation fails, a-addr does not represent a valid address and
ior is the implementation-defined I/O result code.
FREE ( a-addr -- ior )
Return the contiguous region of data space indicated by a-addr to the
system for later allocation. a-addr shall indicate a region of data
space that was previously obtained by ALLOCATE or RESIZE. The
data-space pointer is unaffected by this operation.
If the operation succeeds, ior is zero. If the operation fails, ior is
the implementation-defined I/O result code.
(American National Standard for Information Systems)
: front! here + ! ;
What's the stack diagram? I guess ( n offset_in_cells -- )?

Confused from x86 memory Layout of kernel loader

I am new to Linux kernel stuff and am reading about the memory layout of the kernel loader, but I am confused by the diagram given below:
0A0000  +------------------------+
        |  Reserved for BIOS     |      Do not use.  Reserved for BIOS EBDA.
09A000  +------------------------+
        |  Command line          |
        |  Stack/heap            |      For use by the kernel real-mode code.
098000  +------------------------+
        |  Kernel setup          |      The kernel real-mode code.
090200  +------------------------+
        |  Kernel boot sector    |      The kernel legacy boot sector.
090000  +------------------------+
        |  Protected-mode kernel |      The bulk of the kernel image.
010000  +------------------------+
        |  Boot loader           |      <- Boot sector entry point 0000:7C00
001000  +------------------------+
        |  Reserved for MBR/BIOS |
000800  +------------------------+
        |  Typically used by MBR |
000600  +------------------------+
        |  BIOS use only         |
Now, the statement explaining this diagram is a bit confusing for me:
When using bzImage, the protected-mode kernel was relocated to 0x100000 ("high memory"), and the kernel real-mode block (boot sector, setup, and stack/heap) was made relocatable to any address between 0x10000 and end of low memory.
First, where is the address 0x100000 in the above diagram?
Second, when it says the kernel real-mode block was made relocatable to "any address between 0x10000 and end of low memory", does that mean it was relocatable to an address between 0x10000 and 000600?
Initially the kernel mode block is placed between 0x10000 and 09A000.
"it is desirable to keep the "memory ceiling" -- the highest point in low memory touched by the boot loader -- as low as possible, since some newer BIOSes have begun to allocate some rather large amounts of memory, called the Extended BIOS Data Area, near the top of low memory".
When it says low memory, does that mean memory downwards towards 000600, and high memory upwards towards 0A0000?
First, where is the address 0x100000 in the above diagram?
0x100000 (exactly 1 MB) is not on the diagram because only the first megabyte is special. Beyond that point the physical memory is contiguous at least until the 15-16MB point.
Second, when it says the kernel real-mode block was made relocatable to "any address between 0x10000 and end of low memory", does that mean it was relocatable to an address between 0x10000 and 000600?
Real-mode code can live anywhere below approximately 1 MB, and the end of low memory is probably around there, at 0x9A000 or wherever the EBDA begins.
When it says low memory, does that mean memory downwards towards 000600, and high memory upwards towards 0A0000?
You have it on the diagram: low memory runs from 0xA0000 downwards, towards 0.
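The addresses in this discussion can be checked with a little arithmetic. The sketch below converts a real-mode segment:offset pair to a linear address and labels where it falls; the function names and the region labels are chosen for illustration, with the boundaries taken from the diagram (0xA0000) and the quoted text (0x100000).

```python
LOW_MEMORY_END = 0xA0000      # top of the diagram
HIGH_MEMORY_START = 0x100000  # 1 MB, where bzImage places the protected-mode kernel


def linear_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def classify(address):
    if address < LOW_MEMORY_END:
        return "low memory"
    if address < HIGH_MEMORY_START:
        return "between low and high memory (video memory, BIOS)"
    return "high memory"


boot_entry = linear_address(0x0000, 0x7C00)    # the 0000:7C00 entry point
setup_start = linear_address(0x9000, 0x0200)   # 0x90200 in the diagram
print(hex(boot_entry), classify(boot_entry))
print(hex(setup_start), classify(setup_start))
print(hex(HIGH_MEMORY_START), classify(HIGH_MEMORY_START))
```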

What is the difference between STATUS_STACK_BUFFER_OVERRUN and STATUS_STACK_OVERFLOW?

I just found out that there is a STATUS_STACK_BUFFER_OVERRUN and a STATUS_STACK_OVERFLOW. What's the difference between those 2? I just found Stack overflow (stack exhaustion) not the same as stack buffer overflow but either it doesn't explain it or I don't understand it. Can you help me out?
Regards
Tobias
Consider the following stack which grows downward in memory:
+----------------+
| some data      |   |
+----------------+   | growth of stack
| 20-byte string |   V
+----------------+
  limit of stack
A buffer overrun occurs when you write 30 bytes to your 20-byte string. This corrupts entries further up the stack ('some data').
A stack overflow is when you try to push something else on to the stack when it's already full (where it says 'limit of stack'). Stacks are typically limited in their maximum size.
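Both failure modes can be reproduced on a toy stack. This is a simplified model, not what Windows does when it raises either status code: `ToyCallStack` and `ToyStackOverflow` are made-up names, and the 64-byte stack size is an assumption chosen to keep the numbers small.

```python
class ToyStackOverflow(Exception):
    """Raised when the toy stack has no room left."""


class ToyCallStack:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.sp = size  # empty stack; address 0 is the "limit of stack"

    def allocate(self, nbytes):
        """Reserve nbytes by moving the stack pointer down."""
        if nbytes > self.sp:
            raise ToyStackOverflow("no room left on the stack")
        self.sp -= nbytes
        return self.sp

    def unchecked_write(self, address, data):
        """Copy data without checking the destination buffer's size."""
        end = address + len(data)
        if end > len(self.memory):
            raise ValueError("write runs past the toy memory")
        self.memory[address:end] = data


stack = ToyCallStack(64)
some_data = stack.allocate(16)             # occupies addresses 48..63
stack.unchecked_write(some_data, b"S" * 16)
string = stack.allocate(20)                # occupies addresses 28..47

# Buffer overrun: 30 bytes written into the 20-byte string.
stack.unchecked_write(string, b"A" * 30)
damaged = bytes(stack.memory[some_data:some_data + 16])
print(damaged)                             # the first 10 bytes are now b"A"

# Stack overflow: asking for more than the remaining 28 bytes.
try:
    stack.allocate(100)
    overflowed = False
except ToyStackOverflow:
    overflowed = True
print(overflowed)
```

The overrun damages data that sits at higher addresses than the buffer, while the overflow is refused because the stack as a whole has hit its limit.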
A stack overflow happens when there is no more space in memory to allocate your data, whereas a buffer overrun (a.k.a. buffer overflow) happens when a program overruns a buffer's boundary and writes over data in an unexpected part of memory (it takes more memory than expected).
You can easily understand this just by reading the descriptions of the stackoverflow and buffer-overflow tags.

thread stack size on Windows (Visual C++)

Is there a call to determine the stack size of a running thread? I've been looking in MSDN thread functions documentation, and can't seem to find one.
Whilst there isn't an API to find out stack size directly, contiguous virtual address space must be reserved up to the maximum stack size - it's just that a lot of that space isn't committed yet. You can take advantage of this and make two calls to VirtualQuery.
For the first call, pass it the address of any value on the stack to get the base address and size, in bytes, of the committed stack space. On an x86 machine where the stack grows downwards, subtract the size from the base address and VirtualQuery again: this will give you the size of the space reserved for the stack (assuming you're not precisely on the limit of stack size at the time). Summing the two naturally gives you the total stack size.
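The arithmetic of the two-call approach can be illustrated without touching Windows at all. In the sketch below, `fake_virtual_query` is a hypothetical stand-in for VirtualQuery that looks up a hand-written region table; the addresses and sizes are invented, and real stacks have details (such as guard pages) that this model ignores.

```python
from collections import namedtuple

Region = namedtuple("Region", "base size state")

# Invented layout of one thread's stack: 1 MB reserved in total,
# of which the top 0x3000 bytes are committed so far.
REGIONS = [
    Region(base=0x00100000, size=0x000FD000, state="reserved"),
    Region(base=0x001FD000, size=0x00003000, state="committed"),
]


def fake_virtual_query(address, regions=REGIONS):
    """Return the region containing `address` (stand-in for VirtualQuery)."""
    for region in regions:
        if region.base <= address < region.base + region.size:
            return region
    raise ValueError("address is not inside any known region")


def total_stack_size(address_on_stack):
    # First call: any address on the stack lands in the committed part.
    committed = fake_virtual_query(address_on_stack)
    # Second call: step below the committed part into the reserved part.
    reserved = fake_virtual_query(committed.base - committed.size)
    return committed.size + reserved.size


total = total_stack_size(0x001FE123)
print(hex(total))  # 0x100000
```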
You can get the current committed size from the Top and Bottom in the TEB. You can get the process's initial reserve and commit sizes from the PE header. But you cannot retrieve the actual sizes passed to CreateThread, nor is there any API to get the remaining reserved or committed size of the current stack; see Thread Stack Size.

Resources