Ask for clarification about "the segment registers continue to point to the same linear addresses as in real address mode" [duplicate]

This question already has an answer here:
How can the x86 processor fetch the instruction just after GDT is loaded by a bootloader?
(1 answer)
Closed 1 year ago.
The question is about how the code segment selector remains valid while switching from real mode to protected mode on the Intel i386. The switching code is as follows (excerpted from bootasm.S of the x86 version of xv6):
9138 # Switch from real to protected mode. Use a bootstrap GDT that makes
9139 # virtual addresses map directly to physical addresses so that the
9140 # effective memory map doesn’t change during the transition.
9141 lgdt gdtdesc
9142 movl %cr0, %eax
9143 orl $CR0_PE, %eax
9144 movl %eax, %cr0
9150 # Complete the transition to 32-bit protected mode by using a long jmp
9151 # to reload %cs and %eip. The segment descriptors are set up with no
9152 # translation, so that the mapping is still the identity mapping.
9153 ljmp $(SEG_KCODE<<3), $start32
The GDT layout is as follows:
9182 gdt:
9183 SEG_NULLASM # null seg
9184 SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff) # code seg
9185 SEG_ASM(STA_W, 0x0, 0xffffffff) # data seg
After line 9144 executes, the processor is in protected mode with only segmentation enabled (paging has not been enabled yet). My understanding is that, since segmentation is now in effect, fetching the next instruction should follow the protected-mode segmentation rules. At this point (immediately before line 9153), however, the code segment selector is still 0, which in my understanding means it selects the zeroth descriptor in the GDT, which is the null descriptor. So my question arises naturally: how can the ljmp instruction be fetched through a null descriptor? I tried to answer this by googling, and one document gives the following explanation: http://www.logix.cz/michal/doc/i386/chp10-03.htm#10-03
The segment registers continue to point to the same linear addresses
as in real address mode
This sentence seems to answer my question: if the segment registers continue to point to the same linear addresses, the next instruction should be the same as in real mode, that is, ljmp. But it immediately raises a series of new questions. Why can the segment selector "continue to point to the same linear addresses"? Hasn't the processor switched to protected mode? Doesn't the value 0 in %cs point to the zeroth descriptor, rather than the first one (set in line 9184), which is the descriptor that should be used to fetch the ljmp instruction? How does the x86 CPU magically know that ljmp is the next instruction it should execute? Which manual, if any, describes this magic? I tried to persuade myself that the ljmp had been prefetched into the processor's instruction queue, but the second paragraph of the same webpage tells me that the prefetched ljmp, if any, has been invalidated, so the CPU should fetch the next instruction afresh. Can you please clarify how "the segment registers continue to point to the same linear addresses as in real address mode"? Thank you.
PS: the CPU I am working with is Intel i386 compatible.

The modern reference is the Intel Software Developer's Manual, Volume 3A, Section 9.9.1, "Switching to protected mode".
Intel isn't big on explaining how magic works internally. What it says, and all you need to know, is that if your movl %eax, %cr0 is immediately followed by a far jump or far call, then everything will work. If you put any other instruction there, then "random failures can occur" (their wording).
As it says, %cs continues to hold its previous value, and presumably that's the value that would be pushed on the stack if you did a far call as the instruction after movl %eax, %cr0. (Where the stack would be is another interesting question - I think everyone uses the jump instead so it rarely comes up.) But for this one instruction it evidently isn't used as a selector in the usual way.
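For comparison, "the usual way" of interpreting a protected-mode selector can be sketched in C. This only illustrates the architectural selector layout (index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0). The struct and function names are invented for this example, and SEG_KCODE is assumed to be 1, matching the GDT in the question where the code segment immediately follows the null descriptor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three fields packed into a 16-bit selector. */
struct selector_fields {
    uint16_t index; /* bits 15..3: descriptor index in the GDT or LDT */
    uint8_t  ti;    /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 1..0: requested privilege level */
};

/* Build a selector from its parts, e.g. index 1 gives SEG_KCODE<<3 == 0x08. */
static uint16_t make_selector(uint16_t index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Split a selector back into its fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 1u);
    f.rpl   = (uint8_t)(sel & 3u);
    return f;
}

/* Byte offset of the selected descriptor within its table (8 bytes each). */
static uint32_t descriptor_offset(uint16_t sel)
{
    return (uint32_t)decode_selector(sel).index * 8u;
}
```

Under this layout a selector value of 0 decodes to index 0 in the GDT, which is the null descriptor the question worries about, while the 0x08 used by the ljmp decodes to index 1.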
One guess as to how it might work: we know that in protected mode, there are hidden registers that store the segment attributes, and are reloaded from the descriptor table when you load a segment register. So the movl %eax, %cr0 might cause the hidden register corresponding to %cs to be loaded with attributes of a segment whose base address is the linear address of the current 16-bit segment: e.g. if %cs contained 0x1234 then it could be a segment with base address 0x12340. But the %cs register itself could be left alone, temporarily not matching its hidden counterpart. Then if the high bits of %eip are zeroed, the next instruction would be fetched from the right place. That instruction is required to be the long jump which will reload %cs as well as the hidden segment attribute register.
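This first guess can be written down as a toy model in C. It is not a description of documented hardware behavior, only a sketch of the idea that address translation uses a hidden base while the visible selector is left alone; every name in it is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a segment register: the visible selector plus the hidden
 * (cached) base address that is actually used for address translation. */
struct seg_reg {
    uint16_t visible;     /* value software sees in %cs */
    uint32_t hidden_base; /* cached base used to form linear addresses */
};

/* Real-mode load: the hidden base is simply the segment value times 16. */
static void load_real_mode(struct seg_reg *sr, uint16_t seg)
{
    assert(sr != 0);
    sr->visible = seg;
    sr->hidden_base = (uint32_t)seg << 4;
}

/* Protected-mode far jump: the hidden base is refreshed from the descriptor.
 * desc_base stands in for the base field read from the GDT entry. */
static void load_from_descriptor(struct seg_reg *sr, uint16_t sel,
                                 uint32_t desc_base)
{
    assert(sr != 0);
    sr->visible = sel;
    sr->hidden_base = desc_base;
}

/* Instruction fetch address: hidden base + offset, whatever the mode. */
static uint32_t fetch_linear(const struct seg_reg *sr, uint32_t eip)
{
    assert(sr != 0);
    return sr->hidden_base + eip;
}
```

In this model, setting CR0.PE touches neither field, so the fetch of the ljmp still goes through the real-mode base; only the far jump calls the equivalent of load_from_descriptor.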
It's also possible that it just sets some internal flag that says "even though in protected mode, fetch the next instruction according to real-mode address translation". Then this flag gets cleared when a far jump occurs, or after one instruction has been fetched, or something like that.

Related

When is the kernel stack's ESP stored to the TSS for the interrupt return (iret)?

When I read Intel's x86 programmer's manual, I saw the following description of interrupt and interrupt return with stack switching:
interrupt:
If a stack switch does occur, the processor does the following:
Temporarily saves (internally) the current contents of the SS, ESP, EFLAGS, CS, and EIP registers.
Loads the segment selector and stack pointer for the new stack (that is, the stack for the privilege level being called) from the TSS into the SS and ESP registers and switches to the new stack.
Pushes the temporarily saved SS, ESP, EFLAGS, CS, and EIP values for the interrupted procedure’s stack onto the new stack.
Pushes an error code on the new stack (if appropriate).
Loads the segment selector for the new code segment and the new instruction pointer (from the interrupt gate or trap gate) into the CS and EIP registers, respectively.
If the call is through an interrupt gate, clears the IF flag in the EFLAGS register.
Begins execution of the handler procedure at the new privilege level.
On return:
Performs a privilege check.
Restores the CS and EIP registers to their values prior to the interrupt or exception.
Restores the EFLAGS register.
Restores the SS and ESP registers to their values prior to the interrupt or exception, resulting in a stack switch back to the stack of the interrupted procedure.
Resumes execution of the interrupted procedure.
For example, consider a Linux process P:
1. It is initially in kernel mode.
2. It returns to user mode with iret. According to the manual, this makes no change to the TSS.
3. It traps into the kernel with int. Here it needs to find the kernel stack from the ESP and SS in the TSS. How is this kernel stack value set up, given that it was not stored to the TSS in step 2?
Once the kernel returns to user-space for a given task, it's done with that task's kernel stack until the next interrupt / exception. There's no useful data on it, so the TSS can hold a fixed SS:[ER]SP value that points to the top of the virtual page[s] allocated as the kernel stack for the current task.
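A minimal sketch of that idea in C, assuming the kernel writes the TSS stack fields when it switches tasks rather than on iret. The type names, the field subset, and the 16 KiB stack size are assumptions made for illustration and are not taken from the Linux source.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 16384u /* assumed size of each task's kernel stack */

/* Only the two TSS fields relevant here; a real 32-bit TSS has many more. */
struct tss_stub {
    uint32_t esp0; /* stack pointer loaded on entry to ring 0 */
    uint16_t ss0;  /* stack segment loaded on entry to ring 0 */
};

/* Hypothetical per-task record holding where its kernel stack lives. */
struct task {
    uint32_t kstack_base; /* lowest address of the task's kernel stack */
};

/* On a context switch, point the TSS at the top of the next task's (empty)
 * kernel stack. Nothing has to be saved here on the way back to user mode. */
static void switch_to(struct tss_stub *tss, const struct task *next,
                      uint16_t kernel_ss)
{
    assert(tss != 0 && next != 0);
    tss->ss0  = kernel_ss;
    tss->esp0 = next->kstack_base + KSTACK_SIZE;
}
```

The point of the sketch is that esp0 is a fixed function of which task is current, so iret never needs to update it.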
Kernel state doesn't live on the kernel stack between entries into the kernel; it's kept elsewhere in a process control block. (Context switches between tasks actually happen in the kernel, switching kernel stacks to the formerly-sleeping task's kernel stack, so eventually returning to user-space means returning up the call-chain of whatever that task was doing in the kernel first).
BTW, unless the kernel pushes a new CS:EIP / EFLAGS / SS:ESP for iret to pop, the stuff it pops will be the stuff pushed by hardware at the address specified in the TSS. So even if there was some desire to re-enter the kernel with the stack as you left it, that would normally be at the TSS location anyway. But this is irrelevant because Linux doesn't keep stuff on a task's kernel stack while user-space is running, except for a pointer to per-task stuff at the bottom of the region where the kernel can find it with [ER]SP & -16384.
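The [ER]SP & -16384 trick mentioned above can be illustrated with a small C helper. This is only a sketch of the masking arithmetic, assuming a power-of-two region size aligned to its own size; the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Round a stack pointer down to the start of its aligned stack region.
 * With region_size == 16384 this computes sp & -16384. */
static uintptr_t stack_region_base(uintptr_t sp, uintptr_t region_size)
{
    /* The mask trick only works for power-of-two sizes. */
    assert(region_size != 0 && (region_size & (region_size - 1)) == 0);
    return sp & ~(region_size - 1);
}
```

Any stack pointer inside the same 16 KiB region yields the same base, which is why a pointer stored at the bottom of the region can be found from [ER]SP alone.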
(I think this is right; I've looked at a few bits of Linux kernel code but haven't really gotten my hands dirty experimenting with things. I think this is how Linux works, and a consistent viable design.)

By what logic does ASLR change the memory addresses in a file's assembly code?

I am patching an exe file using OllyDbg, and I access a specific memory address with MOV EAX, DWORD PTR DS:[00DE3DA0] at two locations. The first location is an instruction I replaced somewhere in the middle of the file; the other is at the very bottom, where there was some empty space that I could use for new instructions. My issue is that after ASLR takes effect (for example, after a Windows restart), the memory address in the bottom instruction is not changed to match the new address layout, so my read there is incorrect, whereas at the other location the address is automatically set to the correct one by ASLR and my code always works. My latest observation is that this memory regeneration only happens to my code if the instruction I am replacing itself read from or wrote to another memory address in DS, like DS:[xxxxxxxx].
I am looking for information on the logic by which ASLR decides to regenerate an address. Is it possible to make my code at the bottom be regenerated like the code above?

What does write_cr0(read_cr0() | 0x10000) do?

I searched the web a lot but didn't find a short explanation of what write_cr0(read_cr0() | 0x10000) really does. It is related to the Linux kernel, and I am curious about developing LKMs. I want to know what this really does and what security issues it raises.
It is used to remove the write protection on the syscall table.
But how does it really work, and what does each part of this line do?
CR0 is one of the control registers available on x86 CPUs, which contains flags controlling CPU features related to memory protection, multitasking, paging, etc. You can find a full description in Volume 3, Section 2.5 of Intel's Software Developer's Manual.
These registers are accessed by special instructions that the compiler doesn't normally generate, so read_cr0() is a function which executes the instruction to read this register (via inline assembly) and returns the result in a general-purpose register. Likewise, write_cr0() writes to this register.
The function calls are likely to be inlined, so that the generated code would be something like
mov eax, cr0
or eax, 0x10000
mov cr0, eax
The OR with 0x10000 sets bit 16, the Write Protect bit. On early 32-bit x86 CPUs, code running at supervisor level (like the kernel) was always allowed to write all of virtual memory, regardless of whether the page was marked read-only. This bit makes that optional, so that when it is set, such accesses will cause page faults. This line of code probably follows an earlier line which temporarily cleared the bit.
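The bit arithmetic can be checked in ordinary user-space C, without touching the real register. The helpers below are illustrative names, not kernel APIs; they operate on a plain 32-bit value standing in for CR0.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (1u << 0)  /* Protection Enable */
#define CR0_WP (1u << 16) /* Write Protect */

/* The constant in the question is exactly bit 16. */
static_assert(CR0_WP == 0x10000u, "0x10000 is bit 16");

/* Return the value with Write Protect set, like read_cr0() | 0x10000. */
static uint32_t cr0_set_wp(uint32_t cr0)
{
    return cr0 | CR0_WP;
}

/* Return the value with Write Protect cleared, like read_cr0() & ~0x10000. */
static uint32_t cr0_clear_wp(uint32_t cr0)
{
    return cr0 & ~CR0_WP;
}

/* Nonzero if Write Protect is set in the given value. */
static int cr0_wp_enabled(uint32_t cr0)
{
    return (cr0 & CR0_WP) != 0;
}
```

A typical pairing would clear the bit, perform the write to the read-only page, and then set it again, which matches the guess that this line follows an earlier one that cleared the bit.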

What are the Registers pushed to the stack when an Interrupt Occurs

What are the states saved by the CPU automatically when an interrupt occurs? And in which order?
What are the states saved by the CPU automatically when an interrupt occurs?
Some registers are saved; this set is defined by the CPU architecture. It may be saved to the stack, to a fixed address in memory, or in shadow registers. Usually this set of registers is small; if the ISR needs more, it must save them in its own code rather than relying on automatic CPU hardware. (Check the link from Cody Gray at "Interrupt entry/exit.")
And in which order?
The order in which registers are saved, when they are pushed to the stack, is defined by the architecture.
For the default architecture, x86/x86_64, the definition is as follows (see the first link already listed in my answer https://stackoverflow.com/a/38031260/196561 to your previous question, with the "*FLAGS, CS, IP" order):
https://en.wikibooks.org/wiki/X86_Assembly/Advanced_Interrupts & iret documentation http://www.felixcloutier.com/x86/IRET:IRETD.html
In real mode, on an interrupt the hardware pushes FLAGS, then CS, then IP; the iret instruction reloads them to return to the interrupted code.
In protected mode, check the VM and NT flags in EFLAGS to find how to start and return from an interrupt. On a hardware interrupt (the kind of interrupt you are asking about): check that the stack has 10-20 bytes; load SS, eSP, CS:EIP/CS:IP; push the "long pointer to old stack", push eflags, push the "long pointer to return location".
The actual logic is more complex. The 386 documentation gives the logic for entry into an interrupt at http://intel80386.com/386htm/INT.htm and for iret at http://intel80386.com/386htm/IRET.htm ("Operation").
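The protected-mode push order above can be pictured as a C struct. This sketch assumes a 32-bit gate, a stack switch (privilege change), and no error code; the struct and helper names are invented for illustration. The first member sits at the lowest address, so it is the item pushed last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame as it sits in memory after a 32-bit protected-mode interrupt that
 * switched stacks; the first member is at the lowest address. */
struct intr_frame32 {
    uint32_t eip;    /* return offset, pushed last */
    uint32_t cs;     /* return code selector in the low 16 bits */
    uint32_t eflags; /* flags at the time of the interrupt */
    uint32_t esp;    /* old stack pointer, present only on a stack switch */
    uint32_t ss;     /* old stack selector, pushed first */
};

/* Five 32-bit slots: the "20 bytes" upper bound mentioned above. */
static_assert(sizeof(struct intr_frame32) == 20, "five 32-bit slots");
static_assert(offsetof(struct intr_frame32, eflags) == 8,
              "EFLAGS sits between the two far pointers");

/* Extract the 16-bit selector from the CS slot. */
static uint16_t saved_cs(const struct intr_frame32 *f)
{
    assert(f != 0);
    return (uint16_t)f->cs;
}
```

Without a privilege change the esp and ss slots are absent, which gives the smaller frame sizes.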

What happens when return address points to a RETN instruction

In the context of exploit development, I end up with this stack:
ESP 5d091399 Return address -- points to a RETN in a DLL
ESP+4 42424242 ASCII 'BBBB' -- labelled "junk" in my tutorial
ESP+8 5209398c Beginning of ROP chain and shellcode
...
In the stack above, the return address has been overwritten with a pointer to a RETN instruction in a DLL. The effect of this is that the stack goes from ESP to ESP+8, where the ROP chain starts.
ESP+4 is never at the top of the stack, hence its label "junk" and the fact that it can contain any garbage bytes we want. It will have no effect on the execution of the exploit.
How does a return address pointing to a RETN result in the top of the stack going from ESP to ESP+8?
(Or why are the four middle bytes unimportant?)
Note:
The pointer in ESP is definitely to a single RETN on its own. It is not preceded by a POP.
EDIT:
This tutorial also shows the "junk" (scroll down a tiny bit until exploit code).
This is the same context as my other tutorial (which is on paper, can't link)
