I am always referring to x86 (Linux).
Are logical addresses created during the generation of a binary?
If yes, are they inside the binary?
Thanks
The LINKER defines the initial layout of the process's user address space. It therefore defines the range of logical addresses and their page attributes (read-only or read/write, execute or no-execute).
The user area of the logical address space gets set up by the program loader when the executable is run.
The answer to your question
Are logical addresses created during the generation of a binary?
then depends on whether by "created" you mean when the logical address space is defined (linker) or when it is set up (program loader).
In x86, a logical address (also called a far pointer) consists of a 16-bit segment selector and a 16/32/64-bit offset (also called a near pointer). The size of the offset depends on the operating mode, the code segment descriptor, and the address size prefix. Then the segment selector is used to obtain the segment base address (or it's obtained from the segment descriptor cache except when operating in 64-bit mode in which the base address is considered to be zero for all segments except for FS and GS) to be added to the offset to form a virtual address. The x86 ISA offers no way to completely skip that process. So any x86 instruction must specify the two parts that constitute the logical address separately (implicitly or explicitly).
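To make the selector-plus-offset arithmetic concrete, here is a rough Python sketch. It only models the final addition. The `segment_bases` dictionary stands in for the descriptor table (or descriptor cache), and the selector values and base addresses in it are made up for illustration, not taken from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalAddress:
    selector: int  # 16-bit segment selector
    offset: int    # 16/32/64-bit offset (near pointer)


def to_linear(addr, segment_bases, address_bits=32):
    """Add the segment base to the offset to form the virtual address.

    Wrapping to `address_bits` is a simplifying assumption of this sketch.
    """
    base = segment_bases[addr.selector]
    return (base + addr.offset) & ((1 << address_bits) - 1)


# Hypothetical selectors and bases, chosen only for this example.
segment_bases = {0x23: 0x00000000, 0x63: 0x7FFD0000}

flat = to_linear(LogicalAddress(0x23, 0x00401000), segment_bases)
based = to_linear(LogicalAddress(0x63, 0x30), segment_bases)
```

With a zero base (the flat model), the virtual address is simply the offset; with a non-zero base, the base is added on top.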
Are logical addresses created during the generation of a binary?
An x86 binary contains x86 instructions. Each instruction specifies which segment register to use and how to calculate the offset (using stuff like base, index, scale, and displacement). At run-time, when an instruction is being executed, the offset is calculated and the segment selector value is determined. So, technically, x86 instructions only tell the CPU where to get the segment selector from and how to calculate the offset, but it is the CPU that generates the logical address. Generally, the compiler and the OS determine the values of offsets, but only the OS controls the values of the segment selectors.
If yes, are they inside the binary?
x86 instructions may specify the offset as an immediate value (constant). The segment part can be specified as an immediate value (far call or far jump), fetched from a segment register, or fetched from memory (far return). So the value of the offset might be in the binary, encoded with the instruction that uses it, but the value of the segment selector might not.
Related
Afaik they are never used and CS=DS=SS nowadays. However, if I were to set these values, would anything change, or does the processor ignore them? I've found really conflicting information on the question and I don't understand why they would still be there if they are ignored. Help pls
Yes, the segment registers do still affect code execution.
The question and some of the comments don't seem to distinguish between the selector value and the base address. To clearly understand some of the apparently conflicting information you're reading on this topic, you need to make sure you recognize which one is being discussed.
The CS selector cannot be 0. It must refer to a valid code segment descriptor in the GDT or LDT. The L bit of the code segment descriptor controls whether the current process is in 64-bit mode or 32-bit compatibility mode.
CS (the selector) cannot be equal to DS and SS. CS must refer to a code segment, whereas DS and SS must refer to data segments (possibly the same one). The DS and SS selectors are allowed to be 0 (which would cause a GP fault in 32-bit mode).
The main aspect of segment registers that no longer has an effect is the base address and segment limit: the base addresses of CS, DS, ES, and SS are all treated as if they are 0, and there are no segment limit checks in 64-bit code.
This is the reason you see people saying that they are ignored.
As Margaret mentioned, the current privilege level (CPL) is in the low 2 bits of the CS and SS selector registers and also in the DPL bits of the descriptors in the GDT. These bits should be either 0 or 3, since no current operating systems use rings 1 and 2, as far as I know.
One other minor point is that certain faults caused by memory accesses are reported as stack faults instead of GP faults, if the memory access is performed using the SS segment (because RBP or RSP is used as a base register in the instruction operand).
As far as my understanding goes, the CPU always generates a virtual address that is made up of 2 parts: the page number and the page offset. The page number is used for indexing the page table (the corresponding mapping gives the starting address of the frame in RAM). Now, please consider the following questions. Assume that the word size of the machine is 4 bytes, and the page size is equal to the frame size = 4096 bytes.
Supposing that the page number is 4 and the offset is 3. Then Page 4 in the logical memory maps to frame 8 in the virtual memory. This means that the starting address of the frame is 8.
Now, each frame will contain 4096/4= 1024 words. Does the offset imply for a word inside the frame, since the machine will always fetch a word at a time? What I mean is that does it mean the 3rd word in frame 8?
Is the particular word given to the CPU, or the entire frame? If former, then why does everyone talk about transfer in terms of frames and pages rather than words?
Suppose a page fault occurs. What this means is that the particular page is not in memory. Does it mean that the physical address mapped contains some other page? Does the mapping even exist in such a case, when the invalid bit is 1?
Can someone clear-up things for me? One moment I seem to get it, and the very next, I get into a maze.
The key point of paging is that it deals with "chunks" of memory.
It is a map, a function, that translates virtual addresses into physical addresses, but not on an address-by-address basis. Rather, a "chunk" of contiguous virtual addresses is translated together into another contiguous "chunk" of, now physical, addresses.
You can think of it as a "translation" or "shuffle" of "chunks" of memory.
The correct term for "chunk" is page.
If you try a sample mapping, you can see that each page contains a set of addresses that all share a peculiarity: their lower bits don't change when passing from virtual to physical. The upper bits, instead, are arbitrary.
This split of the address value defines the Offset and the Page/Frame number.
The offset is the part of the address value that doesn't undergo any translation.
In a page of 4KiB there are 4096 addresses, each one with its offset, so the offset has size log2(4096) = log2(2^12) = 12 * log2(2) = 12 bits.
In short the page size determines the offset size.
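As a small illustration of that split (a sketch only, assuming the 4096-byte page size from the question), the page number is the upper part of the address and the offset is the low 12 bits:

```python
PAGE_SIZE = 4096
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # log2(4096) = 12


def split_address(vaddr):
    """Return (page number, offset within the page) for a virtual address."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)


page, offset = split_address(0x4003)
```

Here `0x4003` splits into page 4 and offset 3, which matches the numbers used in the question.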
It is necessary to break the memory into pages rather than words or bytes or, seen another way, to group the addresses to be translated into pages.
Without pages, the metadata used for the translation (in jargon, the page tables of the various levels) would occupy more memory than the memory being translated!
Offsets are relative to their page/frame thanks to the way they are defined: the offset 1024 (in hex 400h) in frame 8 means the address 8000h + 400h = 8400h; if the page is mapped to frame 12, the offset 1024 is still 1024 bytes after the beginning of the frame, 0c000h + 400h = 0c400h.
Being an address, an offset usually denotes a byte, even in architectures where bytes are not addressable. However, this is not a standard convention; to know whether an offset denotes a word or a byte (e.g. whether offset 10 of frame 0 is byte 40 or byte 10), check the architecture manual. The first sections are usually dedicated to establishing the terminology used throughout the book.
Paging happens before the CPU accesses the memory; you can think of it as a high-level process. The unit that accesses the memory/bus is mostly unaware of it, so the CPU reads the data that the instruction tells it to read (a word, a byte, and so on).
People talk about moving a page because a page is the smallest unit that can be characterized.
You can mark a page as non present, but not a word. You can make a page read-only but not a word.
If you need to map, say 16 bytes, you still need to map a whole page since 16 bytes are not characterizable. So we might as well read a whole page.
When a page-fault occurs it means that the page accessed is, at any level in the page-tables, non present.
This may mean a wide range of things, from the fact that the Present bit has been simply toggled (with the page still there), to the fact that the page has been saved to disk and zero-ed in memory.
Since the mapping function is total, meaning that every value is a valid value, the CPU needs a way to know when a value is not valid.
The Present bit does this: it tells the CPU that a translation must not be performed and that an exception must be raised instead.
The OS uses this exception to be notified when a page is needed; it doesn't need to reassign the mapping to another page or zero the memory.
When people say that a page is removed, they mean that it is removed from the mapping, though all modern OSes also zero the page to prevent leaking information to other processes.
So if a physical frame is not mapped, it doesn't mean that another page in another process is mapping it; it simply means that that range of addresses cannot be accessed.
As said above there are a lot of reasons for an OS to do this, including protection.
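The role of the Present bit can be sketched in a few lines of Python. This is a toy single-level page table, not how any real MMU or OS is implemented; `PageTableEntry`, `PageFault`, and `translate` are names invented for the example.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised instead of translating when the entry is marked not present."""


@dataclass
class PageTableEntry:
    frame: int     # physical frame number
    present: bool  # the Present bit


def translate(page_table, vaddr):
    page, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table[page]
    if not entry.present:
        # No translation is performed; the OS gets notified instead.
        raise PageFault(page)
    return entry.frame * PAGE_SIZE + offset


page_table = {4: PageTableEntry(frame=8, present=True),
              5: PageTableEntry(frame=0, present=False)}

ok = translate(page_table, 0x4003)
try:
    translate(page_table, 0x5000)
    faulted = False
except PageFault:
    faulted = True
```

Note that the entry for page 5 still holds a frame number; the Present bit alone decides whether that value is used.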
You have things a bit backwards. The operating system defines a logical address space for each process. The logical address space is divided into units of memory called PAGES.
The operating system logically maps the pages of the address space to either physical page frames or secondary storage. If the operating system maps pages to secondary storage, then it is using virtual memory.
In ye olde days all systems that did logical memory translation always did virtual memory mappings to secondary storage. That is why the terms virtual memory translation and logical memory translation are often conflated. These days it is becoming increasingly common to have logical translation without virtual memory.
All address accesses by a process are to logical addresses. The processor translates the logical address to a page frame. If a logical page exists but is mapped to secondary storage, accessing that page triggers a page fault. The operating system must handle the fault: remap the logical/virtual page to a physical page frame, load the data from secondary storage into the page frame, and restart the instruction.
Supposing that the page number is 4 and the offset is 3. Then Page 4 in the logical memory maps to frame 8 in the virtual memory. This means that the starting address of the frame is 8.
This makes no sense. A logical page is virtual when it is mapped to secondary storage. If the page number is 4, the 4th logical page can:
a) have no mapping at all (access violation)
b) map to a physical page frame
c) map to a secondary storage (virtual memory)
Now, each frame will contain 4096/4= 1024 words. Does the offset imply for a word inside the frame, since the machine will always fetch a word at a time? What I mean is that does it mean the 3rd word in frame 8?
In nearly all (if not all) current processors there are no memory words; only bytes. The system bus fetches memory and the "word size" of the bus can be (and often is) different from the "word size" of the processor.
Is the particular word given to the CPU, or the entire frame? If former, then why does everyone talk about transfer in terms of frames and pages rather than words?
The process sees transfers in sizes related to the instruction being executed. The operand size can be larger or smaller than the machine word. The bus transfers data to memory and that size is frequently different from the word size of the machine.
Suppose a page fault occurs. What this means is that the particular page is not in memory. Does it mean that the physical address mapped contains some other page? Does the mapping even exist in such a case, when the invalid bit is 1?
I gave the three possibilities for logical page mappings above. How those are indicated is system specific. Some systems use 2 bits to indicate (a), (b), or (c). Others use a single bit to indicate (b) and require the operating system to determine whether it's (a) or (c).
Whether or not a page fault is triggered depends on the state of the page table.
Generally a page fault means that the page frame is not in memory. However, it is often possible for the physical page frame to be in memory but not mapped in the page table (a soft page fault). (This occurs when the operating system has unmapped page frames to free some up but has not reallocated them.) In this case, the operating system simply needs to update the page table to point to the page frame and restart the instruction (no need to load from secondary storage).
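The cases above can be summarized as a small decision function. This is only a sketch: the entry fields (`present`, `frame_still_in_ram`) and the returned labels are hypothetical and do not correspond to any particular operating system's data structures.

```python
def classify_access(entry):
    """Classify a memory access from a (hypothetical) page table entry.

    `entry` is None when the logical page has no mapping at all.
    """
    if entry is None:
        return "access violation"   # case (a): no mapping
    if entry["present"]:
        return "no fault"           # case (b): mapped to a page frame
    if entry["frame_still_in_ram"]:
        return "soft page fault"    # just fix up the page table
    return "hard page fault"        # case (c): load from secondary storage


results = [
    classify_access(None),
    classify_access({"present": True, "frame_still_in_ram": True}),
    classify_access({"present": False, "frame_still_in_ram": True}),
    classify_access({"present": False, "frame_still_in_ram": False}),
]
```

Only the last case requires reading from secondary storage before the instruction is restarted.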
The reason this gets me confused is that all addresses hold a sequence of 1's and 0's. So how does the CPU differentiate, let's say, 00000100(integer) from 00000100(CPU instruction)?
First of all, different commands have different values (opcodes). That's how the CPU knows what to do.
Finally, the question remains: What's a command, what's data?
Modern PCs use the von Neumann architecture (https://en.wikipedia.org/wiki/John_von_Neumann), where data and opcodes are stored in the same memory space. (There are architectures that separate the two, such as the Harvard architecture.)
Explaining everything in detail would be far beyond the scope of Stack Overflow; most likely the number of characters allowed per post would not be sufficient.
To answer the question in as few words as possible (everyone actually working at this level would kill me for the shortcuts in the explanation):
Data in the memory is stored at certain addresses.
Each CPU instruction basically consists of 3 different addresses (NOT values - just addresses!):
Address of what to do
Address of a value
Address of an additional value
So, assuming an addition should be performed and you have 3 addresses available in memory, the application would store the following (in the case of 5+7; I used "verbs" for the instructions):
Address | Stored Value
1 | ADD
2 | 5
3 | 7
Finally the CPU receives the instruction 1 2 3, which then means ADD 5 7 (these things are order-sensitive! [Command] [v1] [v2])... And now things get complicated.
The CPU will move these values (actually not the values, just the addresses of the values) into its registers and then process them. The exact registers chosen depend on datatype, data size and opcode.
In the case of the command #1 #2 #3, the CPU will first read these memory addresses and then know that ADD 5 7 is desired.
Based on the opcode for ADD the CPU will know:
Put Address #2 into r1
Put Address #3 into r2
Read Memory-Value Stored at the address stored in r1
Read Memory-Value stored at the address stored in r2
Add both values
Write result somewhere in memory
Store Address of where I put the result into r3
Store Address stored in r3 into the Memory-Address stored in r1.
Note that this is simplified. Actually the CPU needs exact instructions on whether it's handling a value or an address. In assembly this is done by using
eax (means the value stored in register eax)
[eax] (means the value stored in memory at the address stored in register eax)
The CPU cannot perform calculations on values stored in the memory, so it is quite busy moving values From memory to registers and from registers to memory.
i.e. If you have
eax = 0x2
and in memory
0x2 = 110011
and the instruction
MOV ebx, [eax]
this means: take the value stored in memory at the address currently held in eax and move it into the register ebx. So finally
ebx = 110011
(This is happening EVERY TIME the CPU does a single calculation! Memory -> Register -> Memory)
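The `MOV ebx, [eax]` example can be mimicked with two Python dictionaries. This is just a toy model of registers and memory, with `mov_load` as a made-up helper name; it is not how a CPU is built.

```python
# Toy state matching the example: eax holds an address, memory holds the value.
registers = {"eax": 0x2, "ebx": 0}
memory = {0x2: 0b110011}


def mov_load(dest, addr_reg):
    """Model of `MOV dest, [addr_reg]`: read memory at the address in addr_reg."""
    address = registers[addr_reg]
    registers[dest] = memory[address]


mov_load("ebx", "eax")
```

After the call, `registers["ebx"]` holds `0b110011`, while `eax` still holds the address `0x2`.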
Finally, the requesting application can read its predefined memory address #2, which yields address #2568, and then knows that the outcome of the calculation is stored at address #2568. Reading that address will yield the value 12 (5+7).
This is just a tiny, tiny example of what's going on. For a more detailed introduction, refer to http://www.cs.virginia.edu/~evans/cs216/guides/x86.html
One cannot really grasp the amount of data movement and calculation done for a simple addition of 2 values. Doing what a CPU does (on paper) would take you several minutes just to calculate "5+7", since there is no "5" and no "7" - everything is hidden behind an address in memory, pointing to some bits, resulting in different values depending on what the bits at address 0x1 are instructing...
Short form: The CPU does not know what's stored there, but the instructions tell the CPU how to interpret it.
Let's have a simplified example.
If the CPU is told to add a word (let's say, a 32-bit integer) stored at location X, it fetches the content of that address and adds it.
If the program counter reaches the same location, the CPU will again fetch this word and execute it as a command.
The CPU (other than security stuff like the NX bit) is blind to whether it's data or code.
The only way data doesn't accidentally get executed as code is by carefully organizing the code to never refer to a location holding data with an instruction meant to operate on code.
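Here is a small sketch of the same bytes being read two ways. The opcode table is a toy, invented for this example (it is not a real instruction set); the point is only that the interpretation comes from how the bytes are used, not from the bytes themselves.

```python
import struct

# Four bytes sitting somewhere in memory.
raw = bytes([0x04, 0x00, 0x00, 0x00])

# Read as data: a little-endian 32-bit unsigned integer.
as_integer = struct.unpack("<I", raw)[0]

# Read as code: the first byte is looked up in a hypothetical opcode table.
TOY_OPCODES = {0x04: "ADD", 0x05: "SUB"}
as_instruction = TOY_OPCODES[raw[0]]
```

The same byte `0x04` ends up as the integer 4 in one case and as the toy `ADD` instruction in the other.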
When a program is started, the processor starts executing it at a predefined spot. The author of a program written in machine language will have intentionally put the beginning of their program there. From there, each instruction will always end up setting the next location the processor executes to somewhere that holds an instruction. This continues to be the case for all of the instructions that make up the program, unless there is a serious bug in the code.
There are two main ways instructions can set where the processor goes next: jumps/branches, and not explicitly specifying. If the instruction doesn't explicitly specify where to go next, the CPU defaults to the location directly after the current instruction. Contrast that with jumps and branches, which have space to explicitly encode the address of the next instruction. Jumps always jump to the place specified. Branches check if a condition is true. If it is, the CPU will jump to the encoded location. If the condition is false, it will simply go to the instruction directly after the branch.
Additionally, a machine language program should never write data to a location that holds instructions, or some instruction at a future point in the program could try to run what was overwritten with data. That could cause all sorts of bad things to happen. The data there could have an "opcode" that doesn't match anything the processor knows how to execute. Or, the data there could tell the computer to do something completely unintended. Either way, you're in for a bad day. Be glad that your compiler never messes up and accidentally inserts something that does this.
Unfortunately, sometimes the programmer using the compiler messes up, and does something that tells the CPU to write data outside of the area they allocated for data. (A common way this happens in C/C++ is to allocate an array L items long, and use an index >=L when writing data.) Having data written to an area set aside for code is what buffer overflow vulnerabilities are made of. Some program may have a bug that lets a remote machine trick the program into writing data (which the remote machine sent) beyond the end of an area set aside for data, and into an area set aside for code. Then, at some later point, the processor executes that "data" (which, remember, was sent from a remote computer). If the remote computer/attacker was smart, they carefully crafted the "data" that went past the boundary to be valid instructions that do something malicious. (To give them more access, destroy data, send back sensitive data from memory, etc).
This is because an ISA must define what the valid set of instructions is and how to encode their operands: memory addresses, registers, and literals.
See this for more general info on how an ISA is designed:
https://en.wikipedia.org/wiki/Instruction_set
In short, the operating system tells it where the next instruction is. In the case of x64 there is a special register called rip (instruction pointer) which holds the address of the next instruction to be executed. It will automatically read the data at this address, decode and execute it, and automatically increment rip by the number of bytes of the instruction.
Generally, the OS can mark regions of memory (pages) as holding executable code or not. If an error or exploit tries to modify executable memory an error should occur, similarly if the CPU finds itself trying to execute non-executable memory it will/should also signal an error and terminate the program. Now you're into the wonderful world of software viruses!
I'm reading the book "Write Great Code: Understanding the Machine" by Randall Hyde. It is a great and clear text, but I'm completely stuck on his explanation of, for example, the mov instruction.
He dissects the steps for the mov(srcReg,destMem) instruction as follows:
1. Fetch the instruction's opcode from memory.
2. Update the EIP register with the address of the byte following the opcode.
3. Decode the instruction's opcode to see what instruction it specifies.
4. Fetch the displacement associated with the memory operand from the memory location immediately following the opcode.
5. Update EIP to point at the first byte beyond the operand that follows the opcode.
6. If the mov instruction uses a complex addressing mode (for example, the indexed addressing mode), compute the effective address of the destination memory location.
7. Fetch the data from srcReg.
8. Store the fetched value into the destination memory location.
I'm lost in steps 4-6. My exact questions are:
Step 4: Why do I need this displacement, how am I going to use it later, and why?
Step 5: I understand that in step 2, EIP must "point" to the next byte, where the next instruction to be executed is stored. But I don't understand why EIP needs to be one byte beyond the operand address. I believed that EIP was concerned only with instructions/opcodes, not data.
Step 6: What exactly is an effective address? Are there other types of address?
Step 4:
Some opcodes reference memory that's relative to the opcode's location. For example, a function might have a constant or static piece of data. If it does, the code may opt to place that right before the function starts (or right after it ends) and refer to it by saying "get the memory from 46 bytes earlier". That's the displacement -- it's an offset from the contents of a register (in this case, EIP), used for referencing data relative to the register's contents.
Step 5
The operands for opcodes are normally stored right after the opcode. So you might have some memory arranged like so: a b c. a is an opcode, b is the operand for a, and c is the next opcode.
If you only move EIP to the end of a (so it references b), then in the next instruction cycle, the computer will assume that b is the next opcode to execute. b isn't supposed to be an opcode though; it's an operand. The computer can't tell the difference between an opcode and an operand though. It just assumes whatever EIP points to is an instruction and executes it. That's why EIP needs to be moved past the operand too.
Step 6
An "effective" address is just an absolute one (relative to the start of memory) while the "complex" address the book refers to is relative to something else (often the contents of a register).
Step 4 showed that an opcode might not refer to an absolute memory address. It could easily refer to a relative one. In fact, programs very frequently refer to addresses that are relative to some register. For example, if you wrote some_struct.data in C and compiled it for an x86 processor, it would load the address of some_struct into a register (say, EAX), then hard-code data's offset from the base of some_struct into the operand. So if there are 5 bytes of data between the start of the struct and the start of the data element, then the instruction might look like load [EAX + 5] -> EBX which means "take what's in EAX, add 5, fetch the data from that address and put it in EBX".
The thing is, the memory doesn't really understand relative addresses like this. It only understands absolute ones. So in order to access a relative address, the processor has to first add that 5 to whatever's in EAX to compute an absolute address. Then it can send that address to the memory controller and have it understood.
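As a rough sketch of that computation: the function below adds up the pieces of an address expression to get the absolute address. The parameter names follow the usual base/index/scale/displacement terminology, and the register and memory contents are made up for the example.

```python
def effective_address(base, displacement=0, index=0, scale=1):
    """Combine the parts of an address expression into one absolute address."""
    return base + index * scale + displacement


# Made-up state: EAX points at some_struct, `data` lives 5 bytes into it.
eax = 0x1000
memory = {0x1005: 42}

address = effective_address(eax, displacement=5)  # [EAX + 5]
ebx = memory[address]
```

Only the final sum, `0x1005` here, is what the memory system ever sees.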
There are two basic types of relative addresses I've worked with (there are more I haven't).
Register relative: The processor takes the contents of a register and uses that as the address in memory. Depending on the opcode and processor support, it may also add an operand to the register as well. Step 4 was dealing with this kind of addressing, with EIP as the register the address was relative to.
Memory relative: Sometimes referred to as "indirect". The processor starts out with a register relative address, then automatically fetches the data at that address and treats it as the real address.
Wikipedia describes lots of other addressing modes on their addressing modes page.
Memory relative took me a while to understand. Say you did a memory relative load where the register contains 10 and the offset is 5. The processor will add them together (10 + 5 = 15). Then, it'll go to that address (15 in this case) and grab whatever's there. If address 15 happens to contain the value 60, then 60 will be treated as the actual address and the processor will load the contents of address 60. If you're familiar with a language with pointers (e.g. C), memory relative is like a pointer-to-a-pointer.
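The 10 + 5 = 15, then 60, walk-through looks like this as a toy Python model. The memory contents are invented (the value stored at address 60 is arbitrary), and `load_memory_relative` is a name made up for the sketch.

```python
# Toy memory: address 15 holds a pointer (60); address 60 holds the real data.
memory = {15: 60, 60: 1234}


def load_memory_relative(register_value, offset):
    pointer_address = register_value + offset  # 10 + 5 = 15
    real_address = memory[pointer_address]     # contents of 15 -> 60
    return memory[real_address]                # contents of 60


value = load_memory_relative(10, 5)
```

The two dictionary lookups correspond to the two memory accesses, which is why this resembles dereferencing a pointer-to-a-pointer.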
While studying OS, I came across the concept of logical memory. Why is there a need for logical memory? How does a CPU generate logical memory? Is the output of the "&ptr" operator a logical or a physical address? Are logical memory and virtual memory the same?
If you're talking about C's and C++'s sizeof, it returns a size and never an address. And the CPU does not generate any memory.
On x86 CPUs there are several layers in address calculations and translations. x86 programs operate with logical addresses comprised of two items: a segment selector (this one isn't always specified explicitly in instructions and may come from the cs, ds, ss or es segment register) and an offset.
The segment selector is then translated into the segment base address (either directly (multiplied by 16 in the real address mode and in the virtual 8086 mode of the CPU) or by using a special segment descriptor table (global or local, GDT or LDT, in the protected mode of the CPU), the selector is used as an index into the descriptor table, from where the base address is pulled).
Then the sum segment base address + offset forms a linear address (AKA virtual address).
If the CPU is in the real address mode, that's the final, physical address.
If the CPU is in the protected mode (or virtual 8086), that linear/virtual address can be further translated into the physical address by means of page tables (if page translation is enabled, of course, otherwise, it's the final physical address as well).
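For the simplest case in that chain, the real address mode, the whole translation is one shift and one addition. A minimal sketch (the segment and offset values are just an example):

```python
def real_mode_physical(segment, offset):
    """Real address mode: segment base = selector * 16, then add the offset."""
    return (segment << 4) + offset


address = real_mode_physical(0xB800, 0x0010)
```

Here the selector `0xB800` gives the base `0xB8000`, and adding the offset `0x10` yields `0xB8010`. In protected mode the base would instead come from a descriptor table, and the result could still go through page translation.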
Physical memory is your RAM or ROM (or flash). Virtual memory is physical memory extended by the space of disk storage (could be flash as well as we now have SSDs).
You really need to read up on this. You seem to have no idea.