How is byte-ordering actually done in little-endian architectures when the data type size is bigger than the word size? - endianness

First, I want to apologize because English is not my native language. I'm taking the CS50 Introduction to Computer Science course and I've come across the concepts of 'endianness' and 'word size', and even though I think I've understood them pretty well, there's still some confusion.
As far as I know, 'word size' refers to the number of bytes a processor can read or write from memory in one cycle, the number of instruction bytes it can fetch at a time, and also the maximum size of memory addresses; these being 4 bytes on 32-bit architectures and 8 bytes on 64-bit architectures. Correct me if I'm wrong about this.
Now, 'endianness' refers to the ordering of the bytes of a multi-byte data type (like an int or float, not a char) when the processor handles them, either to store or to transmit them. According to some definitions I've read, this concept is linked to the word size. For example, Wikipedia says: "endianness is the ordering or sequencing of bytes of a word of digital data". Big-endian means the most significant byte is placed at the smallest memory address, and little-endian means the least significant byte is placed at the smallest memory address instead.
I've seen many examples and diagrams like this one:
Little-endian / Big-endian explanation diagram
I understand big-endian very well, and little-endian is also clear when the data type being processed has a size equal to or smaller than the word size. But what happens when it's bigger than the word size? Imagine an 8-byte data type on a 32-bit little-endian architecture (4-byte words); how are the bytes actually stored?
Ordering #1:
----------------->
lower address to higher address
b7 b6 b5 b4 | b3 b2 b1 b0
   word 0   |   word 1
Ordering #2:
----------------->
lower address to higher address
b3 b2 b1 b0 | b7 b6 b5 b4
word 0 | word 1
I've found mixed answers to this question, and I wanted to have this concept clear to continue. Thank you in advance!

Related

Question about memory space in microprocessor

My teacher has asked me to differentiate the maximum memory space of a 1 MB and a 4 GB microprocessor. Does anyone know how to answer this question, apart from the size difference already mentioned?
https://i.stack.imgur.com/Q4Ih7.png
A 32-bit microprocessor can address up to 4 GB of memory, because its registers can contain an address that is 32 bits in size. (A 32-bit number ranges from 0 to 4,294,967,295.) Each of those values can represent a unique memory location.
The 16-bit 8086, on the other hand, has 16-bit registers which only range from 0 to 65,535. However, the 8086 has a trick up its sleeve: it can use memory segments to increase this range up to one megabyte (20 bits). There are segment registers whose values are automatically bit-shifted left by 4 and then added to the regular registers to form the final address.
For example, let's look at video mode 13h on the 8086. This is the 256-color VGA standard with a resolution of 320x200 pixels. Each pixel is represented by a single byte and the desired color is stored in that byte. The video memory is located at address 0xA0000, but since this value is greater than 16 bits, typically the programmer will load 0xA000 into a segment register like ds or es, then load 0000 into si or di. Once that is done, the program can read from [ds:si] and write to [es:di] to access the video memory. It's important to keep in mind that with this memory addressing scheme, not all combinations of segment and offset represent a unique memory location. Having es = A100/di = 0000 is the same as es=A000/di=1000.

CRC32 CRC peripheral on STM 32 : byte and word streams of same data give different results

I am using the STM32 ARM CRC peripheral and getting different CRC codes for the same data when fed in as bytes compared to when fed in as words.
Using the byte word length and a short word-aligned data string:
const char *ts4 = "The quick brown fox jumped over the lazy brown dog."; // 52 CHARS divisible by 4;
This, with a buffer size of strlen(ts4), gives a CRC32 of ~0xfe045aa6.
I then configured the CRC for WORD size (setting the buffer size to strlen(ts4)/4) and pointed the DMA engine at the CRC data register. It gave a different CRC result, ~0x0d42e6ef, so I called it again for WORD size using the HAL_CALCULATE method (to ensure the DMA was working as expected). This again gave ~0x0d42e6ef.
Does the CRC32 algorithm give different results for different word-size inputs? I don't really want to tie the DMA engine up transferring bytes. Is there an equivalent C function that calculates CRC32 with a 32-bit WORD input? I have tried reversing the order of the bytes in the word, but this does not solve it (I thought it might have been a big/little-endian problem).
That's 51 characters, not 52. That length divided by 4 would give 12, not 13, so the word-mode CRC only covers the first 48 characters. The CRC of the first 48 characters would be expected to differ from the CRC of all 51 characters.
Also, I'd think that you would need to ensure that the string starts on a word boundary.

Why do bytes exist? Why don't we just use bits?

A byte consists of 8 bits on most systems.
A byte typically represents the smallest data type a programmer may use. Depending on language, the data types might be called char or byte.
There are some types of data (booleans, small integers, etc) that could be stored in fewer bits than a byte. Yet using less than a byte is not supported by any programming language I know of (natively).
Why does this minimum of using 8 bits to store data exist? Why do we even need bytes? Why don't computers just use increments of bits (1 or more bits) rather than increments of bytes (multiples of 8 bits)?
Just in case anyone asks: I'm not worried about it. I do not have any specific needs. I'm just curious.
Because at the hardware level memory is naturally organized into addressable chunks. Small chunks mean you can have fine-grained things like 4-bit numbers; large chunks allow for more efficient operation (typically a CPU moves things around in chunks, or multiples thereof). In particular, larger addressable chunks make for bigger address spaces: if my chunks are 1 bit each, then an address range of 1 to 500 only covers 500 bits, whereas 500 8-bit chunks cover 4000 bits.
Note: it was not always 8 bits. I worked on a machine that thought in 6 bits. (Good old octal.)
Paper tape (~1950's) was 5 or 6 holes (bits) wide, maybe other widths.
Punched cards (the newer kind) were 12 rows of 80 columns.
1960s:
B-5000 - 48-bit "words" with 6-bit characters
CDC-6600 -- 60-bit words with 6-bit characters
IBM 7090 -- 36-bit words with 6-bit characters
There were 12-bit machines; etc.
1970-1980s, "micros" enter the picture:
Intel 4004 - 4-bit chunks
8008, 8086, Z80, 6502, etc - 8 bit chunks
68000 - 16-bit words, but still 8-bit bytes
486 - 32-bit words, but still 8-bit bytes
today - 64-bit words, but still 8-bit bytes
future - 128, etc, but still 8-bit bytes
Get the picture? Americans figured that characters could be stored in only 6 bits.
Then we discovered that there was more in the world than just English.
So we floundered around with 7-bit ASCII and 8-bit EBCDIC.
Eventually, we decided that 8 bits was good enough for all the characters we would ever need. ("We" were not Chinese.)
The IBM 360 came out as the dominant machine in the '60s and '70s; it was based on an 8-bit byte. (It sort of had 32-bit words, but that became less important than the almighty byte.)
It seemed such a waste to use 8 bits when all you really needed was 7 bits to store all the characters you ever needed.
IBM, in the mid-20th century "owned" the computer market with 70% of the hardware and software sales. With the 360 being their main machine, 8-bit bytes was the thing for all the competitors to copy.
Eventually, we realized that other languages existed and came up with Unicode/utf8 and its variants. But that's another story.
A good way for me to write something late at night!
Your points are perfectly valid; however, history will always be that insane intruder who ruined your plans long before you were born.
For the purposes of explanation, let's imagine a fictitious machine with an architecture by the name of Bitel(TM) Inside or something of the like. The Bitel specifications mandate that the Central Processing Unit (CPU, i.e., microprocessor) shall access memory in one-bit units. Now, let's say a given instance of a Bitel-operated machine has a memory unit holding 32 billion bits (our fictitious equivalent of a 4 GB RAM unit).
Now, let's see why Bitel, Inc. got into bankruptcy:
The binary code of any given program would be gigantic (the compiler would have to manipulate every single bit!)
32-bit addresses would be (even more) limited: 2^32 individually addressable bits is just 512 MB of memory. 64-bit systems would be safe (for now...)
Memory accesses would become a huge bottleneck. By the time the CPU has fetched all 48 bits it needs to process a single ADD instruction, the floppy would have already spun for too long, and you know what happens next...
Who the **** really needs to optimize a single bit? (See previous bankruptcy justification).
If you need to handle single bits, learn to use bitwise operators!
Programmers would go crazy as both coffee and RAM get too expensive. At the moment, this is a perfect synonym for apocalypse.
The C standard is holy and sacred, and it mandates that the minimum addressable unit (i.e., char) shall be at least 8 bits wide.
8 is a perfect power of 2. (1 is another one, but meh...)
In my opinion, it's an issue of addressing. To address individual bits of data, you would need eight times as many addresses (adding 3 bits to each address) compared to addressing individual bytes. The byte is generally the smallest practical unit to hold a number in a program (with 256 possible values).
Some CPUs use words to address memory instead of bytes. That's their natural data type, so 16 or 32 bits. If Intel CPUs did that it would be 64 bits.
8 bit bytes are traditional because the first popular home computers used 8 bits. 256 values are enough to do a lot of useful things, while 16 (4 bits) are not quite enough.
And, once a thing goes on for long enough it becomes terribly hard to change. This is also why your hard drive or SSD likely still pretends to use 512 byte blocks. Even though the disk hardware does not use a 512 byte block and the OS doesn't either. (Advanced Format drives have a software switch to disable 512 byte emulation but generally only servers with RAID controllers turn it off.)
Also, Intel/AMD CPUs have so much extra silicon doing so much extra decoding work that the slight difference in 8 bit vs 64 bit addressing does not add any noticeable overhead. The CPU's memory controller is certainly not using 8 bits. It pulls data into cache in long streams and the minimum size is the cache line, often 64 bytes aka 512 bits. Often RAM hardware is slow to start but fast to stream so the CPU reads kilobytes into L3 cache, much like how hard drives read an entire track into their caches because the drive head is already there so why not?
First of all, C and C++ do have native support for bit-fields.
#include <iostream>
struct S {
    // will usually occupy 2 bytes:
    //  3 bits: value of b1
    //  2 bits: unused
    //  6 bits: value of b2
    //  2 bits: value of b3
    //  3 bits: unused
    unsigned char b1 : 3, : 2, b2 : 6, b3 : 2;
};

int main()
{
    std::cout << sizeof(S) << '\n'; // usually prints 2
}
Probably the answer lies in performance and memory alignment, and the fact that (I reckon partly because the byte is called char in C) a byte is the smallest part of a machine word that can hold a 7-bit ASCII character. Text operations are common, so a special type for plain text has its benefits in a programming language.
Why bytes?
What is so special about 8 bits that it deserves its own name?
Computers do process all data as bits, but they prefer to process bits in byte-sized groupings. Or to put it another way: a byte is how much a computer likes to "bite" at once.
The byte is also the smallest addressable unit of memory in most modern computers. A computer with byte-addressable memory cannot store an individual piece of data that is smaller than a byte.
What's in a byte?
A byte represents different types of information depending on the context. It might represent a number, a letter, or a program instruction. It might even represent part of an audio recording or a pixel in an image.

Understanding disassembler: See how many bytes are used for add

I disassembled a program (with objdump -d a.out) and now I would like understand what the different sections in a line like
400586: 48 83 c4 08 add $0x8,%rsp
stand for. More specifically, I would like to know how you can see how many bytes are used for adding two registers. My idea was that the 0x8 in add $0x8,%rsp, which is 8 in decimal, gives me 2 * 4, so 2 bytes for adding 2 registers. Is that correct?
PS: compiler is gcc, OS is suse linux
In the second column you see 48 83 c4 08. Every two-digit hex number stands for one byte, so the instruction is four bytes long. The last byte, 08, corresponds to the $0x8 immediate; the other three bytes are the machine code for "add an 8-bit constant to RSP" (for pedantic editors: Intel writes its registers in upper case). Machine code is quite difficult to deconstruct by hand, but your assumption is completely wrong: the 0x8 is the value being added, not a count of bytes.

What determines an architectures byte size?

Am I correct in saying that if I construct a RAM with x storage locations, each of which is y bits wide, then I have x*y bits of y-bit RAM?
Questions such as this one explain with historical examples why we cannot rely on 8b == 1B, but I cannot find confirmation of what this means in terms of architecture.
Slightly older, and slightly wiser, I think I can answer my own question.
Given an N-bit address bus, there are 2^N addressable memory locations.
If an M-bit data bus is desired, then log2(M) address bits select columns, and the remaining N - log2(M) address bits select rows in the RAM array.
So back to the question: with N := x and M := y, we have y*2^x bits of y-bit RAM.
For a numerical example, suppose a 12-bit address and 16-bit data. 8 bits address 2^8 = 256 rows; the remaining 4 bits drive 16 16:1 multiplexers on the columns, giving 16 bits of data output.
