MSI and MSI-X vector support - linux-kernel

Where exactly can I find, in the Linux kernel code, the limits that cap MSI at 32 vectors and MSI-X at 2048 vectors?

The limits to which you are referring are actually from the PCI standard. See, for example, this freely available briefing on MSI:
http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=1c17cc8e96e3c1969ef8969569648e10d65d7e4d
In the kernel itself, there's some sanity checking in the MSI source code, but it looks like the maximum number of vectors is pulled from the PCI config space of the device, which should never report more than 32 for MSI (or 2048 for MSI-X):
http://lxr.free-electrons.com/source/drivers/pci/msi.c?a=sh#L811
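To make those ceilings concrete: both limits fall out of how the capability structures encode the vector count. Below is a minimal C sketch (field layouts as I understand them from the PCI spec; the helper names are mine, not the kernel's):

```c
#include <stdint.h>

/* MSI Message Control register: bits 3:1 ("Multiple Message Capable")
 * hold log2 of the number of requestable vectors, so a device can never
 * legitimately advertise more than 2^5 = 32 (encodings 6 and 7 are reserved). */
static unsigned msi_max_vectors(uint16_t msi_msg_ctrl)
{
    unsigned mmc = (msi_msg_ctrl >> 1) & 0x7;
    return 1u << mmc;                      /* 1, 2, 4, 8, 16 or 32 for valid encodings */
}

/* MSI-X Message Control register: bits 10:0 hold "Table Size minus 1",
 * so the maximum is 0x7FF + 1 = 2048 vectors. */
static unsigned msix_max_vectors(uint16_t msix_msg_ctrl)
{
    return (msix_msg_ctrl & 0x7FF) + 1;    /* 1 .. 2048 */
}
```

The reserved MSI encodings (6 and 7) would otherwise decode to 64 or 128, which is presumably part of what the sanity checking mentioned above guards against.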

Related

A-instruction control bits

I'm building a CPU in the nand2tetris course and I'm kind of stuck.
Do I have to check if the instruction is an A or C instruction?
In the A instruction guide it only shows the first control bit. The MSB controls the output of the first Mux. What controls the load of the A register if it's an A instruction?
If it's an A instruction, the load of the A register should always be 1, I'm pretty sure.
If it's a C instruction there are lots of control bits but I can't use the same control bits for A instructions.
So should I be checking if the instruction coming in is a C or A instruction and then setting the control bits accordingly?
Here's another picture that might be useful.
Here's one way to think about it:
In a C-instruction, there are a lot of bits (accccccdddjjj, plus the most significant bit that says it's a C-instruction) that determine what the various parts of the machine do. So if you are presented with a C-instruction, you just have to route those bits as appropriate to control the machine.
The A-instruction, on the other hand, doesn't have those control bits; it just has the instruction type bit. So in this case, you have to generate the control bits so that there are no changes to the machine state other than storing the 15-bit value into the A register (and incrementing the PC, of course).
You have to do something similar to handle Reset.
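The real project is of course written in HDL, but the decode idea is small enough to sketch in C. Everything below is illustrative (the signal names are mine, not the course's); the point is just that the instruction's MSB selects between the C-instruction's own d-bits and a generated set of control bits for the A-instruction:

```c
#include <stdbool.h>
#include <stdint.h>

/* Control signals the CPU chip has to produce each cycle
 * (names are illustrative, not taken from the course materials). */
struct controls {
    bool load_a;      /* load the A register                            */
    bool load_d;      /* load the D register                            */
    bool write_m;     /* write the ALU output to memory                 */
    bool a_from_alu;  /* A-register input mux: 1 = ALU, 0 = instruction */
};

static struct controls decode(uint16_t instruction)
{
    struct controls c = {0};
    bool is_c_instruction = (instruction >> 15) & 1;

    if (is_c_instruction) {
        /* C-instruction: route its own d-bits (the c-bits and j-bits are
         * routed similarly, not shown here). */
        c.load_a     = (instruction >> 5) & 1;   /* d1: destination includes A */
        c.load_d     = (instruction >> 4) & 1;   /* d2: destination includes D */
        c.write_m    = (instruction >> 3) & 1;   /* d3: destination includes M */
        c.a_from_alu = true;
    } else {
        /* A-instruction: generate the bits yourself. Always load A with the
         * 15-bit constant and leave D, memory and the jump logic untouched. */
        c.load_a     = true;
        c.a_from_alu = false;
    }
    return c;
}
```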

How can Windows split its virtual memory space asymmetrically?

According to the AMD64 Architecture Programmer's Manual Volume 2 (system programming), a logical address is valid only if the bits 48-63 are all the same as bit 47:
5.3.1 Canonical Address Form
The AMD64 architecture requires implementations supporting fewer than the full 64-bit virtual address to ensure that those addresses are in canonical form. An address is in canonical form if the address bits from the most-significant implemented bit up to bit 63 are all ones or all zeros. If the addresses of all bytes in a virtual-memory reference are not in canonical form, the processor generates a general-protection exception (#GP) or a stack fault (#SS) as appropriate.
So it seems the only valid address ranges are 0x0000_0000_0000_0000 ~ 0x0000_7FFF_FFFF_FFFF and 0xFFFF_8000_0000_0000 ~ 0xFFFF_FFFF_FFFF_FFFF, that is, the lower 128 TiB and the higher 128 TiB. However, according to MSDN, the addresses used by the Windows x64 kernel don't seem to follow this rule.
In 64-bit Windows, the theoretical amount of virtual address space is 2^64 bytes (16 exabytes), but only a small portion of the 16-exabyte range is actually used. The 8-terabyte range from 0x000'00000000 through 0x7FF'FFFFFFFF is used for user space, and portions of the 248-terabyte range from 0xFFFF0800'00000000 through 0xFFFFFFFF'FFFFFFFF are used for system space.
So, how can Windows split the virtual address space into lower 8 TiB and higher 248 TiB, despite the hardware specification? I'd like to know why it doesn't cause any problems with the hardware that checks whether the addresses are canonical.
**UPDATE:** It seems Microsoft fixed this discrepancy in Windows 8.1. See https://www.facebook.com/codemachineinc/posts/567137303353192 for details.
You're right; current x86-64 hardware with 48-bit virtual address support requires that the high 16 bits be the sign-extension of the low 48 (i.e. bit 47 matches bits [63:48]). That means about half of the 0xFFFF0800'00000000 to 0xFFFFFFFF'FFFFFFFF range is non-canonical on current x86-64 hardware.
Windows is just describing how it carves up the full 64-bit virtual address space, not which parts of that are actually in use on current x86-64 hardware. It can of course only use the 128 TiB that is canonical, from 0xFFFF8000'00000000 to -1. (Note the position of the 8; there's no gap between it and the high 16 bits that are all-ones, unlike in the theoretical Windows range.)
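For reference, the canonical-form rule from section 5.3.1 boils down to a one-line check; here is a hedged C sketch for the 48-bit case:

```c
#include <stdbool.h>
#include <stdint.h>

/* Canonical check for a 48-bit virtual-address implementation:
 * bits 63:47 must be all zeros or all ones (i.e. bits 63:48 are
 * copies of bit 47). */
static bool is_canonical_48(uint64_t va)
{
    uint64_t top = va >> 47;        /* bits 63:47, seventeen bits */
    return top == 0 || top == 0x1FFFF;
}

/* Examples:
 *   is_canonical_48(0x00007FFFFFFFFFFFULL) -> true   (top of the low half)
 *   is_canonical_48(0xFFFF800000000000ULL) -> true   (bottom of the high half)
 *   is_canonical_48(0xFFFF080000000000ULL) -> false  (start of MSDN's "system
 *                                                     space" range, which is not
 *                                                     representable in 48 bits) */
```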
Top-end servers can be built with 6TiB of RAM or maybe even more. (Xeon Platinum Scalable Processors are apparently available with up to 1.5TiB per socket, and up to 8-way, e.g. the 8180M).
Intel has proposed an extension for larger physical and virtual addressing that adds another level of page tables (https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf), so OSes will hopefully not be stuck without enough virtual address space to map all the RAM (like in the bad old days of PAE on 32-bit-only systems) before systems with more than 128 TiB of physical RAM actually exist.

Why is the register length static in any CPU

Why is the register length (in bits) that a CPU operates on not dynamically/manually/arbitrarily adjustable? Would it make the computer slower if it was adjustable this way?
Imagine you had an 8-bit integer. If you could adjust the CPU register length to 8 bits, the CPU would only have to go through the first 8 bits instead of extending the 8-bit integer to 64 bits and then going through all 64 bits.
At first I thought you were asking whether it was possible to have a CPU with no definite register size. That makes no sense, since the number and size of the registers are physical properties of the hardware and cannot be changed.
However, some architectures let the programmer work on a smaller part of a register, or pair registers together.
x86 does both, for example: add al, 9 uses only the low 8 bits of the 64-bit rax, and div rbx pairs rdx:rax to form a 128-bit dividend.
The reason this scheme is not more widespread is that it comes with a lot of trade-offs.
More registers means more bits needed to address them, simply put: longer instructions.
Longer instructions mean less code density, more complex decoders and less performance.
Furthermore, most elementary operations (the logical ones, addition and subtraction) are already implemented to operate on a full register in a single cycle.
Finally, one execution unit can handle only one instruction at a time; we cannot issue eight 8-bit additions into a 64-bit ALU simultaneously.
So there wouldn't be any improvement in either latency or throughput.
Accessing partial registers is useful for the programmer to stretch the number of available registers: for example, if an algorithm works with 16-bit data, the programmer could use a single physical 64-bit register to store four items and operate on them independently (but not in parallel).
ISAs with variable-length instructions can also benefit from using partial registers, because that usually means smaller immediate values: for example, an instruction that sets a register to a specific value usually has an immediate operand that matches the size of the register being loaded (though RISCs usually sign-extend or zero-extend it).
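To illustrate the "four 16-bit items in one 64-bit register" point, here is a small C sketch; the packing layout is just one arbitrary choice:

```c
#include <stdint.h>
#include <stdio.h>

/* Four 16-bit items packed into one 64-bit word, one per 16-bit "lane".
 * Operating on a single lane means shifting and masking, so the items are
 * handled independently, but one at a time rather than in parallel. */
static uint64_t pack4(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
    return  (uint64_t)a
         | ((uint64_t)b << 16)
         | ((uint64_t)c << 32)
         | ((uint64_t)d << 48);
}

static uint16_t lane(uint64_t packed, unsigned i)   /* i = 0..3 */
{
    return (uint16_t)(packed >> (16 * i));
}

int main(void)
{
    uint64_t r = pack4(10, 20, 30, 40);
    printf("%u %u %u %u\n",
           (unsigned)lane(r, 0), (unsigned)lane(r, 1),
           (unsigned)lane(r, 2), (unsigned)lane(r, 3));
    return 0;
}
```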
Architectures like ARM (and presumably others as well) support half-precision floats. The idea is to do what you were speculating about and what @Margaret explained: with half-precision floats, you can pack two float values into a single register, using less bandwidth at the cost of reduced precision.

8 and 16 bit architecture

I'm a bit confused about bit architectures. I just can't find a good article that answers my questions, so I figured I'd ask SO.
Question 1:
When speaking of a 16-bit architecture, does it mean each RAM address is 16 bits long? So if I create an int (32-bit) in C++, would the variable take up 2 addresses?
Question 2:
In a 16-bit architecture there are only 2^16 (65,536) addresses in RAM. Why can't they add more? Is this because 16 bits can't represent a higher value and therefore can't reference addresses above 65,535?
When speaking of a 16-bit architecture, does it mean each RAM address is 16 bits long? So if I create an int (32-bit) in C++, would the variable take up 2 addresses?
You'd have to ask whoever was speaking of a 16-bit architecture what they meant by it. They could mean addresses are 16-bits long. They could mean general-purpose CPU registers are 16-bits long. They could mean something else. But there's no way we could know what some hypothetical person might mean. There is no universal definition of what makes something a "16-bit architecture".
For example, the 8032 is an 8-bit architecture with 8-bit general purpose registers. But it has a 16-bit pointer register that can be used to address 65,536 bytes of storage.
Regardless of bitness, almost all systems use byte addresses. So a 32-bit variable will take up 4 addresses on a machine of any bitness.
In a 16-bit architecture there are only 2^16 (65,536) addresses in RAM. Why can't they add more? Is this because 16 bits can't represent a higher value and therefore can't reference addresses above 65,535?
With 16-bits, there are only 65,536 possible ways those bits can be set. So a 16-bit register has 65,536 possible values.
Yes. Note, though, that int on 16-bit architectures is usually just 16 bits wide.
Also note that it doesn't make sense to say that a variable "takes up" two addresses. The correct thing to say is that a 32-bit variable is as wide as two pointers on a 16-bit platform.
It will still occupy four bytes of space, no matter what architecture.
Yes; that's exactly what 16-bit addresses mean.
Note that each of these addresses points to a single byte of memory.
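A quick C sketch of that byte-addressing point; the actual addresses printed will vary from run to run, but they come out consecutive:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t x = 0x11223344;                    /* a 32-bit variable      */
    unsigned char *p = (unsigned char *)&x;    /* one address per byte   */

    printf("sizeof(x) = %zu bytes\n", sizeof x);
    for (size_t i = 0; i < sizeof x; i++)      /* four consecutive byte addresses */
        printf("byte %zu lives at address %p\n", i, (void *)(p + i));
    return 0;
}
```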
Depends on your definitions of 8-bit and 16-bit architecture.
The 6502 was considered an 8-bit CPU, because it operated on 8-bit values (the register size), yet had 16-bit addresses.
The 68000 was considered a 16-bit CPU, yet had 32-bit registers and addresses.
With x86, it is generally the address size that defines the architecture.
Also, '64-bit' CPUs don't always have a full 64-bit external address bus. They might internally handle addresses of that size, so the virtual address space can be large, but it doesn't mean they can have that much external memory.
Example From Wikipedia - All internal registers, as well as internal and external data buses, were 16 bits wide, firmly establishing the "16-bit microprocessor" identity of the 8086. A 20-bit external address bus gave a 1 MB physical address space (2^20 = 1,048,576). This address space was addressed by means of internal 'segmentation'. The data bus was multiplexed with the address bus in order to fit a standard 40-pin dual in-line package. 16-bit I/O addresses meant 64 KB of separate I/O space (2^16 = 65,536). The maximum linear address space was limited to 64 KB, simply because internal registers were only 16 bits wide. Programming over 64 KB boundaries involved adjusting segment registers (see below) and remained so until the 80386 introduced wider (32 bits) registers (and more advanced memory management hardware).
So you can see that there is no fixed rule that a 16-bit architecture must have only 16 address lines. Don't mix up the two things, even though it's intuitive to assume they match.
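The "internal segmentation" that the quote mentions is just a shift-and-add; here is a hedged C sketch of how the 8086 formed a 20-bit physical address from two 16-bit registers:

```c
#include <stdint.h>

/* 8086 real-mode address formation: physical = segment * 16 + offset.
 * Two 16-bit registers combine into a 20-bit physical address, which is
 * how a CPU with 16-bit registers reached a 1 MB address space. */
static uint32_t phys_8086(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* at most 0xFFFF0 + 0xFFFF */
}

/* e.g. phys_8086(0xFFFF, 0x0000) == 0xFFFF0, the address the 8086
 * starts executing from after reset (CS:IP = FFFF:0000). */
```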

How do we determine if a processor is 8-bit, 16-bit, or 32-bit

Is it determined by the size of the address bus? If yes, was the 8086 a 20-bit processor? If not, what is the criterion for assigning a bit number like 8-bit, 16-bit, or 32-bit to a processor?
It's not well defined. Broadly, as xtofl points out, it's the size of the atomic unit of computation (in early computers, this wasn't always synonymous with "register"). So the PDP-10 was a 36-bit machine, an 8080 was 8-bit, and an IBM 360 or Intel 80386 is "32 bits".
But there are exceptions. The Motorola 68000 and 68010 CPUs implemented a 32-bit register set, but did it via microcode on top of a mostly 16-bit internal architecture. They were usually marketed as "16-bit" CPUs at the time.
The size of the address bus is almost never the defining factor. All successful "8 bit" CPUs implemented 16 bit addressing, for example (often via odd hacks to make up for the lack of a single address register, c.f. 6502's indirect addressing modes or the Z80's H/L registers). And the 8086, as you mention, used its segment register addressing to get 20 address lines to work (the 80286 extended this trick to 24 bits of physical address). And in the other direction, many "32 bit" CPUs had smaller address buses to save logic that wouldn't be used on a machine that would never have more than a few megabytes of memory: the 68000 limited addressing to 24 bits, even though the pointers themselves were 32. Likewise modern 64 bit CPUs universally implement less than 64 bits of physical address.
As far as I know, the bit width of the processor is determined by how many bits the internal data-processing circuits accept at once. If the adders, multipliers, etc. in the ALU accept 16-bit operands, then the CPU is 16-bit, and if they accept 32 bits, then it is 32-bit. It does not matter what the width of the data bus or the address bus is. In general, the bit length of the accumulator would determine the bit length of the processor.
I guess normally you label it by the size of its accumulators/registers.
With respect to a CPU, I'd say that it's the width of a register. You can do an operation on only 8 bits, 16-bits, 32-bits, etc. at a time.
The bit size (8-bit, 16-bit, 32-bit) of a microprocessor is determined by the hardware, specifically the width of the data bus. The Intel 8086 is a 16-bit processor because it can move 16 bits at a time over the data bus. The Intel 8088 is an 8-bit processor even though it has an identical instruction set. This is similar to the Motorola 68000 and 68008 processors. The bit size is not determined by the programmer's view (the register width and the address range).
I think the first digits of the integrated-circuit part number indicate the type of the processor.
For example, IC 8085 means it is an 8-bit processor.
Any processor can be designated by two attributes:
- its instruction set architecture, and
- the number of bits it can handle in a single clock cycle.
Take for example Intel's IA-32 architecture, also called x86-32: here x86 indicates the architecture and 32 indicates a 32-bit processor.
X - Architecture
There are a number of architectures:
- Pre-x86
- x86: Intel's IA-32 architecture (also called x86-32), and x86-64 with AMD's AMD64 and Intel's Intel 64 versions of it
- Motorola's 6800 and 68000 architectures
- ARM7
Y - Bit processor
Simply put, it is the data-handling capability of the CPU/processor in a single clock cycle.
Suppose it is an 8-bit processor: then in a single clock cycle, the ALU can perform an operation on 8-bit data only. (Note that this operation may be an internal operation like add/sub, as well as transferring data to another I/O device.)
Classification based on registers:
A processor, in addition to the ALU and CU, contains some memory locations as well, called registers. Depending on the processor, a register may typically store 8, 16, 32 or 64 bits. The register size of a particular processor allows us to classify the processor: processors with n-bit registers are called n-bit processors, so processors with 8-bit registers are called 8-bit processors.
Classification based on data-bus width:
Since the ALU can handle only 8-bit data in a single clock cycle, it wouldn't make sense to have a data bus wider than that, so an 8-bit processor will have an 8-bit-wide data bus. Hence the data-bus width can also be an alternative way to find out the bit-processing capability of a processor: a processor with an n-bit data bus can transfer n bits to another device in a single operation.
For the question:
"Suppose we have a 32-bit ALU, i.e. it can take 32 bits at a time, and our data bus is 16 bits wide, i.e. it can hold 16 bits of data at a time; then what will be the answer in this case?"
Examples of such a processor would be the Intel 8088 and the Motorola 68000.
Using bus width classification, the Intel 8088 microprocessor is an 8-bit processor since it uses an 8-bit data bus, although its CPU registers are in fact 16-bit registers.
Similarly the Motorola 68000 is classified as a 16-bit processor, even though its CPU registers are 32-bit registers.
Sometimes a combination of the two classifications is used, where the 8088 might be described as an 8/16-bit processor and the Motorola 68000 as a 16/32-bit processor.
The word size (8 bits, 16 bits or 32 bits) of a microprocessor is the size of the data path in the execution unit. Typically, this is the size of the accumulator.
This is the execution-unit size. An example where this matters is the 8088, which is a 16-bit computer running on an 8-bit bus. The 8085 is 8-bit. The 8086/8088 is 16-bit. The 80386 is 32-bit. Modern Intel processors are 64-bit.
