I am really confused about the topic of direct-mapped caches. I've been looking around for an example with a good explanation, and it's making me more confused than ever.
For example: I have
2048 byte memory
64 byte big cache
8 byte cache lines
With a direct-mapped cache, how do I determine the line, tag, and byte offset?
I believe the total number of address bits is 11, because 2048 = 2^11.
2048/64 = 32 = 2^5 blocks (0 to 31), so 5 bits are needed (tag).
64/8 = 8 = 2^3, so 3 bits for the index.
8-byte cache lines = 2^3, which means I need 3 bits for the byte offset.
So the address would look like this: 5 bits for the tag, 3 for the index, and 3 for the byte offset.
Do i have this figured out correctly?
Did you figure it out correctly? Yes!
Explanation
1) Main memory size is 2048 bytes = 2^11, so you need 11 bits to address a byte (if your word size is 1 byte). [A word is the smallest individual unit that is accessed with an address.]
2) In direct mapping you can calculate the number of tag bits as log2(main memory size / cache size), but let me explain tag bits a little more.
Here the size of a cache line (which is always the same as the size of a main-memory block) is 8 bytes = 2^3 bytes, so you need 3 bits to identify a byte within a cache line. That leaves 8 bits (11 - 3) of the address.
The total number of lines in the cache is (cache size / line size) = 2^6 / 2^3 = 2^3 = 8.
So you need 3 bits to identify the line in which your required byte is present.
That leaves 5 bits (8 - 3).
These 5 bits are used as the tag. :)
3) 3 bits for the index. If you were using "index" to mean the number of bits needed to select a line, then yes, you are right.
4) 3 bits are used to access a byte within a cache line (8 = 2^3).
So,
11-bit total address length = 5 tag bits + 3 bits to select a line + 3 bits to select a byte (word) within a line.
Hope there is no confusion now.
Given the hexadecimal value 0x12345678, copy its bytes to memory using big-endian order.
Address Content
0x00400003 0x78
0x00400002 0x56
0x00400001 0x34
0x00400000 0x12
Is that right?
In big-endian, the most significant byte (12) should come first, and then the rest should come in decreasing order of significance.
If the given number is in big-endian byte order (and it probably is), your solution is right, as it will look like this:
00400000|00400001|00400002|00400003
--------+--------+--------+--------
12 | 34 | 56 | 78
If you had to arrange the bytes in little endian, the arrangement would be reversed:
00400000|00400001|00400002|00400003
--------+--------+--------+--------
78 | 56 | 34 | 12
Note that in this arrangement, only the order of bytes is reversed, but the order of nibbles (4-bit regions = hexadecimal digits) remains the same.
You can read more on the Wikipedia page about endianness.
I have a grayscale image of size 512x512, so each pixel is 8 bits. Can I embed a total of 8 bits into each pixel I wish to embed data in? Is this possible? (I need the image only for embedding data.) If I want to embed data in 10,000 pixels out of the total 512*512 pixels, can I then embed a total of 80,000 bits (10 kB) of data?
A standard grayscale image with 256 levels for each pixel requires 8 bits per pixel. This is because 8 bits are required to encode 256 different levels. If you have an image with dimensions 512 x 512 then the total number of pixels in the entire image is 262,144 pixels. So, the entire image contains 8 bits * 262,144 = 2,097,152 bits worth of information.
If you were to take a subset of these pixels and encode 8 bits of "different" information, note that the resulting image would likely change in appearance. The 8 bits of information at each pixel coordinate previously encoded the pixel intensity (from 0 to 255). If you are replacing this value with some other value then the intensity will be different and the overall image will appear different.
If you want to embed 10KiB of data in a 512x512 image, where the bit depth is 8 bits, I'd recommend just storing 1 bit of data in every second pixel by changing the LSB of each.
Changing just 1 bit of data from every other pixel allows you to store (512*512*1)/2 bits of data, or 16KiB of data. This way you can store all of the data that you need to while only changing the image in a very limited way.
As an example, here's an image with varying amounts of white noise embedded in it (by embedding n bits per pixel); the table below shows how much noise (data) each setting embeds, where X and Y give each sub-image's position in the figure:
X | Y | bits used | data (KiB)
0 | 0 | 0 | 0
1 | 0 | 1 | 32
0 | 1 | 2 | 64
1 | 1 | 3 | 96
0 | 2 | 4 | 128
1 | 2 | 5 | 160
0 | 3 | 6 | 192
1 | 3 | 7 | 224
_ | _ | 8 | 256 (image omitted as just white noise)
As can be seen, embedding up to 64 KiB of data into a 512x512x8 image is perfectly reasonable, with little noticeable change in the image, by editing the 2 LSBs of each pixel, so that a pixel is encoded as:
XXXX XXYY
Where X came from the original image, and Y is 2 bits of the stored data.
I have this question from an Operating Systems test:
Given a disk of 1GB with 16KB blocks:
(1) Calculate the size of the File Allocation Table:
My answer: since there are 2^16 blocks on the disk, we have a table with 2^16 entries, and every entry needs to store 16 bits (since there are 2^16 different blocks, we need 16 bits to identify each of them). So the size is 2^16 times 16 bits = 2^16 x 2^4 = 2^20 bits = 2^17 bytes = 128 KB.
(2) Given the following table, indicate in which block the following bytes are stored:
-byte 131080 of FileA starting at block 4.
-byte 62230 of FileB starting at block 3.
Entry Content
0 10
1 2
2 0
3 6
4 1
5 8
6 7
7 11
8 12
So FileA is (4) -> (1) -> (2), but here is the problem: since every block is 16KB = 2^4 x 2^10 bytes = 2^14 bytes = 16384 bytes, block 4 contains bytes 1 to 16384, block 1 contains bytes 16385 to 32768, and block 2 bytes 32769 to 49152. Where am I supposed to find byte 131080???
Where is this wrong??
What is the difference between the following types of endianness?
byte (8b) invariant big and little endianness
half-word (16b) invariant big and little endianness
word (32b) invariant big and little endianness
double-word (64b) invariant big and little endianness
Are there other types/variations?
There are two approaches to endian mapping: address invariance and data invariance.
Address Invariance
In this type of mapping, the address of each byte is always preserved between big and little endian. This has the side effect of reversing the order of significance (most significant to least significant) within a particular datum (e.g. a 2- or 4-byte word), and therefore the interpretation of the data. Specifically, in little-endian the data is interpreted least-significant byte to most-significant byte, whilst in big-endian the interpretation is most-significant to least-significant. In both cases, the set of bytes accessed remains the same.
Example
Address invariance (also known as byte invariance): the byte address is constant but byte significance is reversed.
Addr Memory
7 0
| | (LE) (BE)
|----|
+0 | aa | lsb msb
|----|
+1 | bb | : :
|----|
+2 | cc | : :
|----|
+3 | dd | msb lsb
|----|
| |
At Addr=0: Little-endian Big-endian
Read 1 byte: 0xaa 0xaa (preserved)
Read 2 bytes: 0xbbaa 0xaabb
Read 4 bytes: 0xddccbbaa 0xaabbccdd
Data Invariance
In this type of mapping, the relative byte significance is preserved for a datum of a particular size; there are therefore different data-invariant endian mappings for different datum sizes. For example, a 32-bit word-invariant endian mapping would be used for a datum size of 32 bits. The effect of preserving the value of a particular-sized datum is that the byte addresses within the datum are reversed between big- and little-endian mappings.
Example
32-bit data invariance (also known as word invariance): The datum is a 32-bit word which always has the value 0xddccbbaa, independent of endianness. However, for accesses smaller than a word, the addresses of the bytes are reversed between big- and little-endian mappings.
Addr Memory
| +3 +2 +1 +0 | <- LE
|-------------------|
+0 msb | dd | cc | bb | aa | lsb
|-------------------|
+4 msb | 99 | 88 | 77 | 66 | lsb
|-------------------|
BE -> | +0 +1 +2 +3 |
At Addr=0: Little-endian Big-endian
Read 1 byte: 0xaa 0xdd
Read 2 bytes: 0xbbaa 0xddcc
Read 4 bytes: 0xddccbbaa 0xddccbbaa (preserved)
Read 8 bytes: 0x99887766ddccbbaa 0x99887766ddccbbaa (preserved)
Example
16-bit data invariance (also known as half-word invariance): The datum is a 16-bit half-word which always has the value 0xbbaa, independent of endianness. However, for accesses smaller than a half-word, the addresses of the bytes are reversed between big- and little-endian mappings.
Addr Memory
| +1 +0 | <- LE
|---------|
+0 msb | bb | aa | lsb
|---------|
+2 msb | dd | cc | lsb
|---------|
+4 msb | 77 | 66 | lsb
|---------|
+6 msb | 99 | 88 | lsb
|---------|
BE -> | +0 +1 |
At Addr=0: Little-endian Big-endian
Read 1 byte: 0xaa 0xbb
Read 2 bytes: 0xbbaa 0xbbaa (preserved)
Read 4 bytes: 0xddccbbaa 0xddccbbaa (preserved)
Read 8 bytes: 0x99887766ddccbbaa 0x99887766ddccbbaa (preserved)
Example
64-bit data invariance (also known as double-word invariance): The datum is a 64-bit word which always has the value 0x99887766ddccbbaa, independent of endianness. However, for accesses smaller than a double-word, the addresses of the bytes are reversed between big- and little-endian mappings.
Addr Memory
| +7 +6 +5 +4 +3 +2 +1 +0 | <- LE
|---------------------------------------|
+0 msb | 99 | 88 | 77 | 66 | dd | cc | bb | aa | lsb
|---------------------------------------|
BE -> | +0 +1 +2 +3 +4 +5 +6 +7 |
At Addr=0: Little-endian Big-endian
Read 1 byte: 0xaa 0x99
Read 2 bytes: 0xbbaa 0x9988
Read 4 bytes: 0xddccbbaa 0x99887766
Read 8 bytes: 0x99887766ddccbbaa 0x99887766ddccbbaa (preserved)
There's also middle- or mixed-endian; see Wikipedia for details.
The only time I had to worry about this was when writing some networking code in C. Networking typically uses big-endian IIRC. Most languages either abstract the whole thing or offer libraries to guarantee that you're using the right endian-ness though.
Philibert said,
bits were actually inverted
I doubt any architecture would break byte-value invariance. However, the order of bit-fields may need inversion when mapping structs containing them onto data. Such direct mapping relies on compiler specifics outside the C99 standard, although it may still be common. Direct mapping is faster, but it does not comply with the C99 standard, which does not stipulate packing, alignment, or byte order. C99-compliant code should use slower mapping based on values rather than addresses. That is, instead of doing this:
#if LITTLE_ENDIAN
struct breakdown_t {
int least_significant_bit: 1;
int middle_bits: 10;
int most_significant_bits: 21;
};
#elif BIG_ENDIAN
struct breakdown_t {
int most_significant_bits: 21;
int middle_bits: 10;
int least_significant_bit: 1;
};
#else
#error Huh
#endif
uint32_t data = ...;
struct breakdown_t *b = (struct breakdown_t *)&data;
one should write this (and this is how the compiler would generate code anyway, even for the above "direct mapping"):
uint32_t data = ...;
uint32_t least_significant_bit = data & 0x00000001;
uint32_t middle_bits = (data >> 1) & 0x000003FF;
uint32_t most_significant_bits = (data >> 11) & 0x001fffff;
The reason behind the need to invert the order of bit-fields in each endian-neutral, application-specific data storage unit is that compilers pack bit-fields into bytes of growing addresses.
The "order of bits" in each byte does not matter, as the only way to extract them is by applying value masks and shifting towards the least-significant or most-significant bit. The "order of bits" issue would only become important in imaginary architectures with a notion of bit addresses. I believe all existing architectures hide this notion in hardware and provide only least- vs. most-significant-bit extraction, which is a notion based on endian-neutral byte values.
The best article I've read about endianness is "Understanding Big and Little Endian Byte Order".
Actually, I'd describe the endianness of a machine as the order of bytes inside of a word, and not the order of bits.
By "bytes" up there I mean the "smallest unit of memory the architecture can manage individually". So, if the smallest unit is 16 bits long (what in x86 would be called a word) then a 32 bit "word" representing the value 0xFFFF0000 could be stored like this:
FFFF 0000
or this:
0000 FFFF
in memory, depending on endianness.
So, if you have 8-bit endianness, it means that every word consisting of 16 bits will be stored as:
FF 00
or:
00 FF
and so on.
Practically speaking, endianness refers to the way the processor interprets the content of a given memory location. For example, suppose we have memory location 0x100 with the following content (hex bytes):
0x100: 12 34 56 78 90 ab cd ef
Reads Little Endian Big Endian
8-bit: 12 12
16-bit: 34 12 12 34
32-bit: 78 56 34 12 12 34 56 78
64-bit: ef cd ab 90 78 56 34 12 12 34 56 78 90 ab cd ef
The two situations where you need to mind endianness are networking code and down-casting with pointers.
TCP/IP specifies that data on the wire should be big-endian. If you transmit types other than byte arrays (like pointers to structures), you should make sure to use the ntoh/hton macros to ensure the data is sent big-endian. If you send from a little-endian processor to a big-endian processor (or vice versa) without converting, the data will be garbled...
Casting issues:
uint32_t *lptr = (uint32_t *)0x100;
uint16_t data;

*lptr = 0x0000FFFF;
data = *((uint16_t *)lptr);
What will be the value of data?
On a big-endian system, it would be 0. On a little-endian system, it would be 0xFFFF.
13 years ago I worked on a tool portable to both a DEC ALPHA system and a PC. On this DEC ALPHA the bits were actually inverted. That is:
1010 0011
actually translated to
1100 0101
It was almost transparent and seamless in the C code except that I had a bitfield declared like
typedef struct {
int firstbit:1;
int middlebits:10;
int lastbits:21;
};
that needed to be translated to (using #ifdef conditional compiling)
typedef struct {
int lastbits:21;
int middlebits:10;
int firstbit:1;
};
As #erik-van-brakel answered in this post, be careful when communicating with certain PLCs: mixed-endian is still alive!
Indeed, I need to communicate with a PLC (from a well-known manufacturer) over the (Modbus-TCP) OPC protocol, and it seems to return mixed-endian data, swapped on every half-word. So mixed-endian is still used by some of the larger manufacturers.
Here is an example with the "pieces" string (image not shown).
The basic concept is the ordering of bits:
1010 0011
in little-endian is the same as
0011 1010
in big-endian (and vice-versa).
You'll notice the order changes by grouping, not by individual bit. I don't know of a system, for example, where
1100 0101
would be the "other-endian" version of the first version.