Parallel CRC-32 calculation for an Ethernet 10GE MAC in VHDL

I have generated an Ethernet 10GE MAC design in VHDL. Now I am trying to implement CRC. I have a 64-bit parallel CRC-32 generator in VHDL.
Specification:
- Data bus is 64 bits wide
- Control bus is 8 bits wide (each bit validates one data byte)
Issue:
Let's say my incoming packet length is 14 bytes (assuming no padding).
The CRC over the first 8 bytes is computed in one clock cycle, but when I try to compute the CRC over the remaining 6 bytes the result is wrong, because zeros are appended to fill the 64-bit word.
Is there a way I can generate the CRC for a packet of any byte length using a 64-bit parallel CRC generator?
What I've tried:
I used different parallel CRC generators (8-bit, 16-bit, and so on), but that consumes a lot of FPGA resources. I want to conserve resources by using just the 64-bit parallel CRC generator.

Start with a constant 64-bit data word that brings the effective CRC register to all zeros. Then prepend the message with zero bytes, instead of appending them, putting those zeros on the end of the 64-bit word that is processed first. (You did not provide the CRC definition, so this depends on whether the CRC is reflected or not. If the CRC is reflected, then put the zero bytes in the least-significant bit positions. If the CRC is not reflected, then put them in the most-significant bit positions.) Then exclusive-or the result with a 32-bit constant.
So for the example, you would first feed a 64-bit constant to the parallel CRC generator, then feed two zero bytes and six bytes of message in the first word, and then eight message bytes in the second word. Then exclusive-or the result with the 32-bit constant.
For the standard PKZIP CRC, the 64-bit constant is 0x00000000ffffffff, the 32-bit constant is 0x2e448638, and the prepended zero bytes go in the bottom of the 64-bit word.
If you are in control of the implementation of the CRC generator, then you can probably modify it to initialize the effective CRC register to all zeros when you reset the generator, avoiding the need to feed the 64-bit constant.
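The zero-prepending part of this recipe can be sanity-checked in software. This is a sketch using Python's zlib, whose crc32 is the standard PKZIP CRC; the 14-byte message is made up for illustration. Note that zlib.crc32 applies the standard pre- and post-conditioning itself, so a raw hardware register would still need the final exclusive-or described above.

```python
import zlib

# The 64-bit constant 0x00000000ffffffff, sent least-significant byte
# first since this CRC is reflected: the four 0xff bytes go in first.
MAGIC = b'\xff\xff\xff\xff\x00\x00\x00\x00'

msg = bytes(range(14))  # an example 14-byte packet

# After the magic word, the effective CRC register is all zeros, so any
# number of zero bytes prepended to the message leaves the result unchanged:
results = {zlib.crc32(MAGIC + b'\x00' * k + msg) for k in (0, 2, 7, 16)}
print(len(results))  # 1 -- the same CRC regardless of how many zeros
```

In hardware terms, once the register is all zeros, shifting in zero bits cannot change it, which is exactly why prepending (rather than appending) the padding is harmless.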

I can't say for certain, but if you can pad zeros at the start of your packet instead of at the end, then you should get the right answer. It does depend on the polynomial and the initializer...
See this answer: Best way to generate CRC8/16 when input is an odd number of BITS (not bytes)? C or Python

Related

Which is more complex: calculating a 64-bit CRC, or two 32-bit CRCs with different polynomials?

I was wondering how a 64-bit CRC on an FPGA compares to two 32-bit CRCs (with different polynomials) on the same FPGA. Would two 32-bit CRCs be more complicated than a single 64-bit CRC? Would it be slow or fast?
How can I calculate the complexity (or do a complexity analysis)?
Any help would be much appreciated.
Thank you.
I was wondering how a 64-bit CRC on an FPGA compares to two 32-bit CRCs (with different polynomials) on the same FPGA.
On a "normal" FPGA it does not matter which kind of information (CRCs, checksums, floating-point values, ...) you compare:
Checking whether one 64-bit value equals another takes the same amount of resources (gates or time).
This is of course not true if you use an FPGA that has a built-in CRC unit which (for example) supports CRC-32 but not CRC-64 ...
Would two 32-bit CRCs be more complicated than a single 64-bit CRC?
In both cases you'll need 64 logic cells (that means: 64 LUTs and 64 flip-flops).
In the case of a 64-bit CRC, 63 logic cells must be connected to the previous logic cell, and there must be one signal line connecting the first and the last logic cell.
In the case of two 32-bit CRCs, 62 logic cells must be connected to the previous logic cell, and there must be two signal lines, each connecting the first and the last logic cell of one CRC.
If you have an FPGA that allows connecting 64 cells in a row without using a "long" signal line, the 64-bit CRC saves one "long" signal line.
(Edit: On the FPGA on my eval board you can connect 16 cells in a row; on such an FPGA, both one 64-bit CRC and two 32-bit CRCs would cost 5 "long" signal lines.)
Is it going to take a while or it would be fast?
How can I calculate the complexity (or do a complexity analysis)?
You require one clock cycle per bit - in both cases.
Note that an FPGA works completely differently from a computer:
You typically don't need extra time to perform each operation; rather, all operations are performed at the same time...
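The one-clock-per-bit behaviour corresponds to a bit-serial (Galois) LFSR update. As a software sketch of that hardware step (a Python model, using the reflected CRC-32 polynomial as an example; the function names are made up):

```python
import zlib

def lfsr_step(state: int, bit: int, poly: int, width: int) -> int:
    """One clock of a bit-serial (reflected/Galois-form) CRC LFSR."""
    feedback = (state ^ bit) & 1
    state >>= 1
    if feedback:
        state ^= poly
    return state & ((1 << width) - 1)

def crc_serial(bits, poly, width, state=0):
    for b in bits:                                # one clock per message bit,
        state = lfsr_step(state, b, poly, width)  # whatever the register width
    return state

# Model check against standard CRC-32 (poly 0xEDB88320, reflected, bytes
# fed least-significant bit first, init and final xor of 0xffffffff):
msg = b'123456789'
bits = [(byte >> i) & 1 for byte in msg for i in range(8)]
crc = crc_serial(bits, 0xEDB88320, 32, 0xFFFFFFFF) ^ 0xFFFFFFFF
print(hex(crc))  # 0xcbf43926, the well-known CRC-32 check value
```

Whether the register is one 64-bit chain or two 32-bit chains, the loop runs once per message bit, which is the point made above.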

CRC32 peripheral on STM32: byte and word streams of the same data give different results

I am using the STM32 ARM CRC peripheral and getting different CRC codes for the same data when it is fed in as bytes compared to when it is fed in as words.
Using the byte word length and a small, word-aligned data string:
const char *ts4 = "The quick brown fox jumped over the lazy brown dog."; // 52 CHARS divisible by 4;
This, with a buffer size of strlen(ts4), gives a CRC32 of 0xfe045aa6.
I then configured the CRC for WORD size (setting the buffer size to strlen(ts4)/4) and pointed the DMA engine at the CRC data register. It gave a different CRC result, 0x0d42e6ef, so I called it again for WORD size using the HAL_CALCULATE method (to ensure the DMA was working as expected). This again gave 0x0d42e6ef.
Does the CRC32 algorithm give different results for different input word sizes? I don't really want to tie the DMA engine up transferring bytes. Is there an equivalent C function that calculates CRC32 with a 32-bit WORD input? I have tried reversing the order of the bytes in each word, but this does not solve it (I thought it might have been a big/little-endian problem).
That's 51 characters, not 52, and that length divided by 4 gives 12, not 13. The CRC of the first 48 characters would be expected to differ from the CRC of all 51 characters.
Also, I'd think that you would need to ensure that the string starts on a word boundary.
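The length arithmetic can be illustrated in software. This is a sketch using Python's zlib as a stand-in (the STM32 peripheral uses a different CRC-32 variant, but the truncation effect is the same):

```python
import zlib

ts4 = "The quick brown fox jumped over the lazy brown dog."
print(len(ts4))  # 51 -- not 52, and not divisible by 4

data = ts4.encode('ascii')
# len(ts4) // 4 = 12 words cover only the first 48 bytes, so the
# word-fed CRC is computed over a shorter message than the byte-fed one:
print(hex(zlib.crc32(data[:4 * (len(data) // 4)])))  # CRC of the first 48 bytes
print(hex(zlib.crc32(data)))                         # CRC of all 51 bytes
```

Feeding 12 words silently drops the final 3 bytes, which by itself is enough to explain two different results.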

Significance of Bytes as 8 bits

I was just wondering why A BYTE IS 8 BITS? Specifically, if we talk about the ASCII character set, all its symbols can be represented in just 7 bits, leaving one spare bit (in a world where 8 bits is 1 byte). Suppose there is a big company in which everyone has agreed to use only the ASCII character set and nothing else (and this company has nothing to do with the outside world). Couldn't the developers in this company write software that treats 7 bits as 1 byte, and thereby save one precious bit per character? That way they would save, for instance, 10 bits of space for every 10 bytes (where 1 byte is again 7 bits), and ultimately lots and lots of precious space. The hardware (hard disk, processor, memory) used in this company would specifically know that it needs to store and group together 7 bits as 1 byte. If this were done globally, couldn't it revolutionise the future of computers? Can this system be developed in reality?
Wouldn't this be efficient?
A byte is not necessarily 8 bits. A byte is a unit of digital information whose size is processor-dependent. Historically, the size of a byte was equal to the size of a character as specified by the character encoding the processor supported. For example, a processor that supports Binary-Coded Decimal (BCD) characters defines a byte to be 4 bits. A processor that supports ASCII defines a byte to be 7 bits. The reason for using the character size to define the size of a byte is to make programming easier, considering that a byte has always (as far as I know) been used as the smallest addressable unit of data storage. If you think about it, you'll find that this is indeed very convenient.
A byte was defined to be 8 bits in the extremely successful IBM S/360 computer family, which used an 8-bit character encoding called EBCDIC. IBM, through its S/360 computers, introduced several crucially important computing techniques that became the foundation of all later processors, including the ones we are using today. In fact, the term byte was coined by Werner Buchholz, a computer scientist at IBM.
When Intel introduced its first 8-bit processor (the 8008), a byte was defined to be 8 bits even though the instruction set didn't directly support any character encoding, thereby breaking the pattern. The processor did, however, provide numerous instructions that operate on packed (4-bit) and unpacked (8-bit) BCD-encoded digits. In fact, the whole x86 instruction set was conveniently designed around 8-bit bytes. The fact that 7-bit ASCII characters fit in 8-bit bytes was a free, additional advantage. As usual, a byte is the smallest addressable unit of storage. I would like to mention here that in digital circuit design, it's convenient to have the number of wires or pins be a power of 2, so that every possible input or output value has a use.
Later processors continued to use 8-bit bytes because that makes it much easier to develop new designs based on older ones, and helps keep newer processors compatible with older ones. Therefore, instead of changing the size of a byte, the register, data bus, and address bus sizes were doubled each time (we have now reached 64 bits). This doubling allowed existing digital circuit designs to be reused easily, significantly reducing processor design costs.
The main reason why it's 8 bits and not 7 is that it needs to be a power of 2.
Also: imagine what nibbles would look like in 7-bit bytes..
It's also ideal (and fast) for conversion to and from hexadecimal.
Update:
What advantage do we get if we have power of 2... Please explain
First, let's distinguish between a byte and an ASCII character. Those are two different things.
A byte is used to store and process digital information (numbers) in an optimized way, whereas a character is (or should be) only meant to interact with us humans, because we find it hard to read binary (although in these modern days of big data, big internet speeds and big clouds, even servers talk to each other in text (XML, JSON), but that's a whole different story..).
As for a byte being a power of 2, the short answer:
The advantage of powers of 2 is that data can easily be aligned efficiently on byte or integer boundaries; for a single byte that means 1, 2, 4 and 8 bits, and it only gets better with higher powers of 2.
Compare that to 7-bit ASCII (or a 7-bit byte): 7 is a prime number, which means only 1-bit and 7-bit values could be stored in aligned form.
Of course there are a lot more reasons one could think of (for example the layout and structure of the logic gates and multiplexers inside CPUs/MCUs).
Say you want to select the input or output pins of a multiplexer: with 2 control lines (bits) you can address 4 pins; with 3, you can address 8 pins; with 4, 16; and so on. The same goes for address lines. So the more you look at it, the more sense it makes to use powers of 2. It seems to be the most efficient model.
As for optimized 7-bit ASCII:
Even on a system with 8-bit bytes, 7-bit ASCII can easily be packed with some bit shifting. A class with an operator[] could be created, without any need for 7-bit bytes (and of course, a simple compression scheme would do even better).
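As a sketch of that packing idea (plain Python functions rather than a class with operator[]; pack7 and unpack7 are made-up names, and the input is assumed to be pure 7-bit ASCII):

```python
def pack7(text: bytes) -> bytes:
    """Pack 7-bit ASCII bytes into a compact bitstream, 7 bits per char."""
    acc = nbits = 0
    out = bytearray()
    for b in text:
        assert b < 128            # must be 7-bit ASCII
        acc = (acc << 7) | b      # append 7 bits to the accumulator
        nbits += 7
        while nbits >= 8:         # emit full bytes as they become available
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    if nbits:                     # flush leftover bits, zero-padded
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)

def unpack7(data: bytes, nchars: int) -> bytes:
    """Recover nchars 7-bit characters from the packed stream."""
    acc = nbits = 0
    out = bytearray()
    for b in data:
        acc = (acc << 8) | b
        nbits += 8
        while nbits >= 7 and len(out) < nchars:
            nbits -= 7
            out.append((acc >> nbits) & 0x7F)
    return bytes(out)
```

Eight 7-bit characters fit in seven 8-bit bytes, a 12.5% saving, at the cost of the shifting shown and of having to track the character count separately.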

Why do bytes exist? Why don't we just use bits?

A byte consists of 8 bits on most systems.
A byte typically represents the smallest data type a programmer may use. Depending on language, the data types might be called char or byte.
There are some types of data (booleans, small integers, etc.) that could be stored in fewer bits than a byte. Yet using less than a byte is not natively supported by any programming language I know of.
Why does this minimum of using 8 bits to store data exist? Why do we even need bytes? Why don't computers just use increments of bits (1 or more bits) rather than increments of bytes (multiples of 8 bits)?
Just in case anyone asks: I'm not worried about it. I do not have any specific needs. I'm just curious.
Because at the hardware level memory is naturally organized into addressable chunks. Small chunks mean that you can have fine-grained things like 4-bit numbers; large chunks allow for more efficient operation (typically a CPU moves things around in chunks, or multiples thereof). In particular, larger addressable chunks make for bigger address spaces: if I have chunks of 1 bit, then an address range of 1-500 covers only 500 bits, whereas 500 8-bit chunks cover 4000 bits.
Note: it was not always 8 bits. I worked on a machine that thought in 6 bits. (Good old octal.)
Paper tape (~1950's) was 5 or 6 holes (bits) wide, maybe other widths.
Punched cards (the newer kind) were 12 rows of 80 columns.
1960s:
B-5000 - 48-bit "words" with 6-bit characters
CDC-6600 -- 60-bit words with 6-bit characters
IBM 7090 -- 36-bit words with 6-bit characters
There were 12-bit machines; etc.
1970-1980s, "micros" enter the picture:
Intel 4004 - 4-bit chunks
8008, 8086, Z80, 6502, etc - 8 bit chunks
68000 - 16-bit words, but still 8-bit bytes
486 - 32-bit words, but still 8-bit bytes
today - 64-bit words, but still 8-bit bytes
future - 128, etc, but still 8-bit bytes
Get the picture? Americans figured that characters could be stored in only 6 bits.
Then we discovered that there was more in the world than just English.
So we floundered around with 7-bit ASCII and 8-bit EBCDIC.
Eventually, we decided that 8 bits was good enough for all the characters we would ever need. ("We" were not Chinese.)
The IBM 360 came out as the dominant machine in the '60s-'70s; it was based on an 8-bit byte. (It sort of had 32-bit words, but that became less important than the almighty byte.)
It seemed such a waste to use 8 bits when all you really needed was 7 bits to store all the characters you ever needed.
IBM, in the mid-20th century "owned" the computer market with 70% of the hardware and software sales. With the 360 being their main machine, 8-bit bytes was the thing for all the competitors to copy.
Eventually, we realized that other languages existed and came up with Unicode/utf8 and its variants. But that's another story.
A good way for me to write something late at night!
Your points are perfectly valid; however, history will always be that insane intruder who ruined your plans long before you were born.
For the purposes of explanation, let's imagine a fictitious machine with an architecture called Bitel(TM) Inside or something of the like. The Bitel specifications mandate that the central processing unit (CPU, i.e., the microprocessor) shall access memory in one-bit units. Now, let's say a given instance of a Bitel-operated machine has a memory unit holding 32 billion bits (our fictitious equivalent of a 4 GB RAM unit).
Now, let's see why Bitel, Inc. got into bankruptcy:
The binary code of any given program would be gigantic (the compiler would have to manipulate every single bit!)
32-bit addresses would be (even more) limited to hold just 512MB of memory. 64-bit systems would be safe (for now...)
Memory accesses would be a terrible bottleneck. By the time the CPU had fetched all of the 48 bits it needs to process a single ADD instruction, the floppy would have spun for far too long, and you know what happens next...
Who the **** really needs to optimize a single bit? (See previous bankruptcy justification).
If you need to handle single bits, learn to use bitwise operators!
Programmers would go crazy as both coffee and RAM get too expensive. At the moment, this is a perfect synonym of apocalypse.
The C standard is holy and sacred, and it mandates that the minimum addressable unit (i.e., char) shall be at least 8 bits wide.
8 is a perfect power of 2. (1 is another one, but meh...)
In my opinion, it's an issue of addressing. To access individual bits of data, you would need eight times as many addresses (adding 3 bits to each address) compared to accessing individual bytes. The byte is generally going to be the smallest practical unit to hold a number in a program (with 256 possible values).
Some CPUs use words to address memory instead of bytes. That's their natural data type, so 16 or 32 bits. If Intel CPUs did that it would be 64 bits.
8 bit bytes are traditional because the first popular home computers used 8 bits. 256 values are enough to do a lot of useful things, while 16 (4 bits) are not quite enough.
And, once a thing goes on for long enough it becomes terribly hard to change. This is also why your hard drive or SSD likely still pretends to use 512 byte blocks. Even though the disk hardware does not use a 512 byte block and the OS doesn't either. (Advanced Format drives have a software switch to disable 512 byte emulation but generally only servers with RAID controllers turn it off.)
Also, Intel/AMD CPUs have so much extra silicon doing so much extra decoding work that the slight difference in 8 bit vs 64 bit addressing does not add any noticeable overhead. The CPU's memory controller is certainly not using 8 bits. It pulls data into cache in long streams and the minimum size is the cache line, often 64 bytes aka 512 bits. Often RAM hardware is slow to start but fast to stream so the CPU reads kilobytes into L3 cache, much like how hard drives read an entire track into their caches because the drive head is already there so why not?
First of all, C and C++ do have native support for bit-fields.
    #include <iostream>

    struct S {
        // will usually occupy 2 bytes:
        // 3 bits: value of b1
        // 2 bits: unused
        // 6 bits: value of b2
        // 2 bits: value of b3
        // 3 bits: unused
        unsigned char b1 : 3, : 2, b2 : 6, b3 : 2;
    };

    int main()
    {
        std::cout << sizeof(S) << '\n'; // usually prints 2
    }
Probably the answer lies in performance and memory alignment, and in the fact that (I reckon partly because byte is called char in C) a byte is the smallest part of a machine word that can hold a 7-bit ASCII character. Text operations are common, so a special type for plain text has its benefits in a programming language.
Why bytes?
What is so special about 8 bits that it deserves its own name?
Computers do process all data as bits, but they prefer to process bits in byte-sized groupings. Or to put it another way: a byte is how much a computer likes to "bite" at once.
The byte is also the smallest addressable unit of memory in most modern computers. A computer with byte-addressable memory cannot address an individual piece of data smaller than a byte.
What's in a byte?
A byte represents different types of information depending on the context. It might represent a number, a letter, or a program instruction. It might even represent part of an audio recording or a pixel in an image.
Source

Where will the Intermediate result be stored during segmentation?

In the 8086, a 20-bit address is generated from two 16-bit registers by using segmentation. The 16-bit segment value is multiplied by 10h (i.e., 16) and the result is added to the 16-bit offset.
When multiplied by 10h it yields a 5-digit hex number, which is 20 bits long.
Where will this intermediate 20-bit result (obtained from the multiplication) be stored?
Nowhere. The address appears on the corresponding address pins of the chip (AD0-AD15 and A16-A19 on the 8086) when it needs to access memory. The calculation is performed internally with dedicated logic. For example, there is no actual multiplication by 0x10: the segment bits are simply wired, four positions higher, into the corresponding adder alongside the offset bits (item 3 here).
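The arithmetic itself can be modeled in a couple of lines (a Python sketch of the address calculation, not of the actual hardware; phys_addr is a made-up name):

```python
def phys_addr(segment: int, offset: int) -> int:
    """Model the 8086 physical-address calculation:
    (segment * 16) + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

# segment 0x1234, offset 0x5678 -> 0x12340 + 0x5678 = 0x179b8
print(hex(phys_addr(0x1234, 0x5678)))
# segment 0xFFFF, offset 0x0010 wraps past 1 MB back to address 0
print(hex(phys_addr(0xFFFF, 0x0010)))
```

The final masking step mirrors the fact that the real chip has only 20 address lines, which is why segment:offset pairs that overflow 1 MB wrap around on an 8086.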
