For a Intel(R) Core(TM)2 Quad CPU Q8400 # 2.66GHz model cpu I am getting both the NumberOfCores and NumberOfLogicalProcessors as 4 .
I want to know how system is calculating NumberOfLogicalProcessors ?
What i should use to get the actual number of cpus ?
OS: win2k8 R2
That depends on what you meant by actual CPU.
Win32_Processor\NumberOfCores specifies the total number of physical CPU cores. A chip can contain one or more CPU cores.
Win32_Processor\NumberOfLogicalProcessors specifies the total number of virtual CPU cores. There can be two or more virtual CPU cores in one physical CPU core. On x86 compatible computers, this is only available in Intel's Hyper-Threading capable CPUs.
On the other hand, the Win32_ComputerSystem\NumberOfProcessors specifies the total number of physical processor chips installed on a multi processor motherboard.
The Win32_ComputerSystem\NumberOfLogicalProcessors is same as Win32_Processor\NumberOfLogicalProcessors.
Related
I'm confused about how multiple launches of same python command bind to cores on a NUMA Xeon machine.
I read that OMP_NUM_THREADS env var sets the number of threads launched for a numactl process. So if I ran numactl --physcpubind=4-7 --membind=0 python -u test.py with OMP_NUM_THREADS=4 on a hyperthreaded HT machine (lscpu output below) it'd limit the this numactl process to 4 threads.
But since machine has HT, it's not clear to me if 4-7 in the above are 4 physical or 4 logical.
How to find which of the numa-node-0 cores in 0-23,96-119 are physical and which ones logical? Are 96-119 all logical or are they interspersed?
If 4-7 are all physical cores, then with HT on there would be only 2 physical cores needed, so what happens to the other 2?
Where is OpenMP library getting invoked in binding threads to physical cores?
(from my limited understanding I could just launch command python main.py in a sh shell 20 times with different numactl bindings and OMP_NUM_THREADS still applies, even though I didn't explicitly use MPI lib anywhere, is that correct?)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 2
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 9242 CPU # 2.30GHz
Stepping: 7
Frequency boost: enabled
CPU MHz: 1000.026
CPU max MHz: 2301,0000
CPU min MHz: 1000,0000
BogoMIPS: 4600.00
L1d cache: 3 MiB
L1i cache: 3 MiB
L2 cache: 96 MiB
L3 cache: 143 MiB
NUMA node0 CPU(s): 0-23,96-119
NUMA node1 CPU(s): 24-47,120-143
NUMA node2 CPU(s): 48-71,144-167
NUMA node3 CPU(s): 72-95,168-191
I read that OMP_NUM_THREADS env var sets the number of threads launched for a numactl process.
numactl do not launch threads. It controls NUMA policy for processes or shared memory. However, OpenMP runtimes may adapt the number of threads created by a region based on the environment set by numactl (although AFAIK this behaviour is undefined by the standard). You should use the environment variable OMP_NUM_THREADS to set the number of threads. You can check the OpenMP configuration using the environment variable OMP_DISPLAY_ENV.
How to find which of the numa-node-0 cores in 0-23,96-119 are physical and which ones logical? Are 96-119 all logical or are they interspersed?
This is a bit complex. Physical IDs are the ones available in /proc/cpuinfo. They are not guaranteed to stay the same over time (eg. they can change when the machine is restarted) nor "intuitive" (ie. following rules like being contiguous for threads/cores close to each other). One should avoid hard-coding them manually. e.g. a BIOS update or kernel update might lead to enumerating logical cores in a different order.
You can use the great tool hwloc to convert well-defined deterministic logical IDs to physical ones. Here, you cannot be entirely sure that 0 and 96 are two threads sharing the same core (although this is probably true here for your processor, where it looks like the kernel enumerated one logical core from each physical core as cores 0..95, then 96..191 for the other logical core on each physical core). The other common possibility is for Linux to do both logical cores of each physical core consecutively, making logical cores 2n and 2n+1 share a physical core.
If 4-7 are all physical cores, then with HT on there would be only 2 physical cores needed, so what happens to the other 2?
--physcpubind of numctl accepts physical cpu numbers as shown in the "processor" fields of /proc/cpuinfo regarding the documentation. Thus, 4-7 here should be interpreted as physical thread IDs. Two threads IDs can refer to the same physical core (which is always the case on Intel processors with hyper-threading enabled).
Where is OpenMP library getting invoked in binding threads to physical cores?
AFAIK, this is implementation dependent of the OpenMP runtime used (eg. GOMP, IOMP, etc.). The initialization of the OpenMP runtime is often done lazily when the first parallel section is encountered. For the binding, some runtimes read /proc/cpuinfo manually while some other use hwloc. If you want deterministic bindings, then you should use the OMP_PLACES and OMP_PROC_BIND environment variables to tell the runtime to bind threads using a custom user-defined method and not the default one.
If you want to be safe and portable, use the following configuration (using Bash):
OMP_NUM_THREADS=4
OMP_PROC_BIND=TRUE
OMP_PLACES={$(hwloc-calc --physical-output --sep "},{" --intersect PU core:all.pu:0)}
The OpenMP threads will be scheduled on OpenMP places. The above configuration configure the OpenMP runtime so that there will be 4 threads statically map on 4 different fixed cores.
What does it mean when they say processor is -
Core i3, i5 or i7?
Core 2 Duo or Dual Core?
CPU # 3.20 GHz or similar GHz?
You mixed up everything a bit :
Core i3, i5, i7, 2 Duo are brand names used by Intel for families of processors it builds.
A dual core processor (or more generally multi-core) is a processor which features multiple processing units. This allows to have multiple threads (i.e. somehow a list of consecutive instructions to be executed), so that 2 or more instruction can be executed at a time.
3.20 GHz is the frequency of the clock of the processor. It is in Hertz (Hz). A higher frequency for a processor with the exact same components while make the processor execute instructions faster (most of the time). However, a processor with a high frequency will not necessarly be faster than a processor with a lower frequency.
See :
Multi-core processor
Clock rate
Intel core
These are names and types of different processors. The vary in the number of cores and clock rate.
The number of the cores is the number of unites which do actually add values. (See ALU)
The clock rate shows how many additions the CPU can make in a specified time.
Help yourself and read:
http://en.wikipedia.org/wiki/Central_processing_unit
http://en.wikipedia.org/wiki/Arithmetic_logic_unit
http://en.wikipedia.org/wiki/Comparison_of_Intel_processors
http://en.wikipedia.org/wiki/Comparison_of_AMD_processors
I would like to bind a specific processor core to a specific RAM DIMM.
Example:
CPU-> Intel i7 2600K (Quadcore) where each core is listed as 0,1,2,3
RAM-> 2 * 2GB DDR3 (Specifics not necessary) listed as 0,1
I would like to execute a program that runs on CPU core 2, and use only RAM DIMM 0.
The closest tool to use would be numactl, but numactl (as far as I have read) controls the memory of a node as a whole not per DIMM.
Your help would be much appreciated...
I bougth recently a server with 2 x X5550, they are quad (4 cores each) total 8 cores
If I check the task manager it shows in the CPU usage history 16 diagrams,
Should't it be 8 cause I have 2 processors with quad?
or the diagrams maybee shows the Threads of the CPU?
The CPUs have support for HyperThreading, so each core x2 logical CPUs.
You can always lookup the chip specs on Intel's site
On a dual quad-core GetProcessAffinityMask (or the dialog from "Set affinity" in taskman.exe) will report eight logical processors. How do I find out which logical processor is on which physical processor? Especially: which logical processors are on the same physical processor?
EDIT: If it is not possible to do this programmatically, do anyone just know what the normal mapping is? Are the first four on the first processor and the second four on the second or are the odd numbered on the first and the even numbered on the second?
You can use Win32_Processor WMI class to query the number of cores, number of logical processors, architecture, cache memory and other information about the CPUs on the system.
To query information about the relationship between the logical processors in a system, you can use GetLogicalProcessorInformation API function.
In case you don't want to write the code yourself, SysInternal's handy coreinfo utility comes closest to answering your questions. It implements GetLogicalProcessorInformation as Mehrdad recommends. For a Xeon E5640 (quad core, 8 threads), you get from coreinfo:
c:\App\SysInternals>Coreinfo.exe -c
Coreinfo v3.0 - Dump information on system CPU and memory topology
Copyright (C) 2008-2011 Mark Russinovich
Sysinternals - www.sysinternals.com
Logical to Physical Processor Map:
**------ Physical Processor 0 (Hyperthreaded)
--**---- Physical Processor 1 (Hyperthreaded)
----**-- Physical Processor 2 (Hyperthreaded)
------** Physical Processor 3 (Hyperthreaded)
There are 8 * for the 8 hyperthreads, two per core, as expected for this chip. What's not clear, though, is how the arrangement of * matches up with the list of logical processors as Windows presents them. For instance, Task Manager gives me a dialog for assigning the processor affinity, labeled CPU 0 through CPU 7, for any process. It's fair (but not necessary) to assume that you can take coreinfo's output and number the logical processors left-to-right. So "CPU 5" would be the second hyperthread running on physical processor 2.
The numbering is done in a sequential manner: first all physical cores followed by the logical cores [1] .
[1] CPU Numbering on a hypertheading enabled system