I am a beginner in cluster configuration. I know that our cluster has two types of worker nodes:
Type 1:
- 16 x 4 TB disks
- 128 GB RAM
- 2 x 8 Core CPUs

Type 2:
- 12 x 1.2 TB disks
- 256 GB RAM
- 2 x 10 Core CPUs
I am confused about the configuration. What does "2 x 8 Core CPUs" mean? Does it mean 2 processors with 8 physical cores each? So if my processors have hyperthreading, I will have 2 x 8 x 2 = 32 virtual cores?
And does "12 x 1.2 TB disks" mean 12 disks with 1.2 TB each?
Usually "2 x 8 Core CPUs" means that you have 2 physical chips on your motherboard, each having 8 cores. If you enable hyperthreading, you then have 32 virtual cores.
The disk count can be read either the way you stated it, or as the number of nodes: then you would have 16 nodes with a 4 TB disk each, and 12 nodes with a 1.2 TB disk each.
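To make the arithmetic concrete, a quick sketch (plain Python; it assumes hyperthreading contributes exactly 2 threads per core):

```python
# Logical (virtual) cores = sockets x cores_per_socket x threads_per_core.
def logical_cores(sockets, cores_per_socket, threads_per_core=2):
    return sockets * cores_per_socket * threads_per_core

print(logical_cores(2, 8))    # 2 x 8 Core CPUs with HT  -> 32
print(logical_cores(2, 10))   # 2 x 10 Core CPUs with HT -> 40
```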
I am just wondering how someone ends up with this hardware without knowing what the specs mean. Can you send me some nodes? :)
I want to design a pricing calculator for a cloud platform that takes into account different service types (Infra or PaaS), VM sizes, storage, and compute units (a custom metric of CPU cores + RAM for PaaS services). These options can be added by the customer in any order and in any combination.
For example:
- For Infra: small (2 CPU cores + 2 GB RAM), medium (4 CPU cores + 4 GB RAM), large (6 CPU cores + 8 GB RAM), and so on.
- For Storage: by GB allocated (pay-as-you-go, pre-paid, and post-paid).
- For PaaS: compute units (CUs). For example, provisioning 10 CUs per hour translates to 2 CPU cores and 4 GB of RAM.
I want a data structure that can represent all of the above combinations of variables.
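One possible shape for this is a flat list of line items plus a pluggable pricing function. A minimal sketch in Python; all names here (ServiceType, LineItem, Quote, price_fn) are hypothetical, not from any existing library:

```python
from dataclasses import dataclass, field
from enum import Enum

class ServiceType(Enum):
    INFRA = "infra"
    STORAGE = "storage"
    PAAS = "paas"

@dataclass
class LineItem:
    service: ServiceType
    quantity: float          # VM count, GB allocated, or compute units
    unit: str                # "vm", "gb", or "cu"
    attributes: dict = field(default_factory=dict)  # e.g. {"size": "medium"} or {"plan": "prepaid"}

@dataclass
class Quote:
    items: list = field(default_factory=list)

    def add(self, item):
        # Items can be added in any order and in any combination.
        self.items.append(item)
        return self

    def total(self, price_fn):
        # price_fn maps one LineItem to a price, keeping the rate card pluggable.
        return sum(price_fn(item) for item in self.items)

# Example: one medium Infra VM, 500 GB of prepaid storage, and 10 PaaS compute units
# (10 CUs per hour = 2 CPU cores + 4 GB RAM, per the conversion above).
quote = (Quote()
         .add(LineItem(ServiceType.INFRA, 1, "vm", {"size": "medium"}))
         .add(LineItem(ServiceType.STORAGE, 500, "gb", {"plan": "prepaid"}))
         .add(LineItem(ServiceType.PAAS, 10, "cu", {"billing": "hourly"})))
```

Keeping pricing rules out of the data structure itself is deliberate: new service types or billing plans then only require a new `LineItem` shape and a matching branch in `price_fn`.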
My University has computational nodes with 128 total cores but comprised of two individual AMD processors (i.e., sockets), each with 64 cores. This leads to anomalous simulation runtimes in ABAQUS using crystal plasticity models implemented in a User MATerial Subroutine (UMAT). For instance, if I run a simulation using 1 node and 128 cores, this takes around 14 hours. If I submit the same job to run across two nodes with 128 cores (i.e., using 64 cores/1 processor on two separate nodes), the job finishes in only 9 hours. One would expect the simulation running on a single host node to be faster than on two separate nodes for the same total number of cores, but this is not the case. The problem is that in the latter configuration, each host node contains two processors each with 64 cores and the abaqus_v6.env file therefore contains:
mp_host_list=[['node_1', 64],['node_2', 64]]
for the 2 node/128 core simulation. The ABAQUS .msg file then accordingly splits the job into two processes each with 64 threads:
PROCESS 1 THREAD NUMBER OF ELEMENTS
1 3840
2 3840
...
63
64 3840
PROCESS 2 THREAD NUMBER OF ELEMENTS
1 3584
2 4096
...
63
64 3840
The problem arises when I specify a single host node with 128 cores because ABAQUS has no way of recognizing that the host node consists of two separate processors. I can modify the abaqus_v6.env file accordingly as:
mp_host_list=[['node_1', 64],['node_1', 64]]
but ABAQUS just clumps this into one process with 128 threads. I believe this is why my simulations actually run faster on two nodes than on one with the same total number of cores: ABAQUS does not recognize that it should treat the single node as two processors/processes.
Is there a way to specify two processes on the same host node in ABAQUS?
As a note, the amount of memory/RAM reserved per core does not change (~2 GB per core).
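For reference, here are the two host-list variants described above side by side (the abaqus_v6.env file uses Python syntax; each entry is [hostname, cores]):

```python
# Two separate nodes: ABAQUS launches two MPI processes, 64 threads each.
mp_host_list = [['node_1', 64], ['node_2', 64]]

# The same node listed twice: ABAQUS clumps the entries into a single
# 128-thread process instead of the desired two 64-thread processes.
# mp_host_list = [['node_1', 64], ['node_1', 64]]
```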
Final update: able to reduce runtimes using multiple nodes
I found that running these types of simulations across multiple nodes reduces run times. Tables of simulation times for two models across various numbers of cores, nodes, and cores/processor are listed below.
The smaller model finished in 9.7 hours on two nodes with 64 cores/node (128 cores total). The runtime dropped by about 25% when simulated over four nodes with 32 cores/node, for the same total of 128 cores. Interestingly, the simulation took longer using three nodes with 64 cores/node (192 cores total); there could be many reasons for this. One surprising result was that the simulation ran faster using 64 cores split over two nodes (32 cores/socket) than 64 cores on a single socket, which suggests the extra memory bandwidth of using multiple nodes helps (the details of which I do not fully understand).
The larger model finished in ~32.5 hours using 192 cores, and there was little difference between using three nodes (64 cores/processor) or six nodes (32 cores/processor), which suggests that at some point using more nodes does not help. However, this larger model finished in 36.7 hours using 128 cores with 32 cores/processor (four nodes). Thus, the most efficient use of nodes for both the larger and smaller models is 128 CPUs split over four nodes.
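A rough check of those percentages, using the wall times from the tables below (plain Python):

```python
# Percentage reduction in wall time when moving between node layouts.
def pct_reduction(baseline_hr, new_hr):
    return 100.0 * (baseline_hr - new_hr) / baseline_hr

print(f"{pct_reduction(9.7, 7.2):.1f}%")    # 128 cores, 2 nodes -> 4 nodes: ~25% as quoted
print(f"{pct_reduction(11.5, 10.5):.1f}%")  # 64 cores, 1 socket -> 2 nodes: ~9%
```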
Simulation details for a model with 477,956 tetrahedral elements and 86,153 nodes. The model is cyclically strained to 1.3% strain for 10 cycles with a strain ratio R = 0.
| # CPUs | # Nodes | ABAQUS processes (actual) | ABAQUS processes (ideal) | Cores per processor | Wall time (hr) | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| 64 | 1 | 1 | 2 | 32 cores/processor | 13.8 | Using cores on two processors but unable to specify two separate processes |
| 64 | 1 | 1 | 1 | 64 cores/processor | 11.5 | No need to specify two processes |
| 64 | 2 | 2 | 2 | 32 cores/processor | 10.5 | Correctly specifies two processes. Surprisingly faster than the scenario directly above! |
| 128 | 1 | 1 | 2 | 64 cores/processor | 14.5 | Unable to specify two separate processes |
| 128 | 2 | 2 | 2 | 64 cores/processor | 8.9 | Correctly specifies two processes |
| 128 | 2 | 2 | 4 | ~32 cores/processor; 4 processors in total | 9.9 | Specifies two processes but should be four |
| 128 | 2 | 2 | 3 | 64 cores/processor | 9.7 | Specifies two processes over three processors |
| 128 | 4 | 4 | 4 | 32 cores/processor | 7.2 | 32 cores per node. Most efficient! |
| 192 | 3 | 3 | 3 | 64 cores/processor | 7.6 | Three nodes with three processors |
| 192 | 2 | 2 | 4 | 64 and 32 cores/processor across the two nodes | 10.5 | Four processors over two nodes |
Simulation details for a model with 4,104,272 tetrahedral elements and 702,859 nodes. The model is strained to 1.3% strain and then back to 0% strain (one cycle).
| # CPUs | # Nodes | ABAQUS processes (actual) | ABAQUS processes (ideal) | Cores per processor | Wall time (hr) | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| 64 | 1 | 1 | 1 | 64 cores/processor | 53.0 | Using a single processor |
| 128 | 1 | 1 | 2 | 64 cores/processor | 57.3 | Using two processors on one node |
| 128 | 2 | 2 | 2 | 64 cores/processor | 40.9 | |
| 128 | 4 | 4 | 4 | 32 cores/processor | 36.7 | Most efficient! |
| 192 | 2 | 2 | 4 | 64 and 32 cores/processor across the two nodes | 42.7 | |
| 192 | 3 | 3 | 3 | 64 cores/processor | 32.4 | |
| 192 | 6 | 6 | 6 | 32 cores/processor | 32.6 | |
**cpu:** E5-2630L * 2
**os:** Linux CentOS 6.3
physical cores: 12
logical cores: 24 (via `grep -c processor /proc/cpuinfo`; hyperthreading)
The E5-2630L has 6 cores per socket, so the total is 24 (6 * 2 * 2).
but /proc/&lt;pid&gt;/status shows:
- Cpus_allowed: ffffffff,ffffffff
- Cpus_allowed_list: 0-63
The CPU has 24 logical cores, so why is Cpus_allowed 64 bits wide?
It is the default; it just means there is no further restriction (besides the available hardware). I think the mask is built from 32-bit words, and it always starts out as two of them (64 bits), which is why you see ffffffff,ffffffff covering CPUs 0-63.
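As a side note, the mask can be decoded by hand; a minimal sketch (the helper decode_cpus_allowed is hypothetical, not part of any library):

```python
# Cpus_allowed is comma-separated 32-bit hex words (most significant first);
# each set bit marks a logical CPU the scheduler may use for this process.
def decode_cpus_allowed(mask):
    value = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]

cpus = decode_cpus_allowed("ffffffff,ffffffff")
print(len(cpus), min(cpus), max(cpus))   # 64 0 63 -> matches Cpus_allowed_list: 0-63
```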
What I know is:
Number of Logical Processors = Cores x Sockets x HT
Is that right? And how many virtual machines can be provisioned with these logical processors?
Exactly. So if it has 2 processors with 4 cores each and HT enabled, then the number of logical processors = 2 x 4 x 2 = 16.
ESX will also use a core, so if you take the simplistic view of allocating cores to VMs, you only have 7 to "allocate".
Have a look at this VMware community question on ESXi CPU.
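A sketch of that accounting (plain Python; the "7 to allocate" figure follows the answer's simplistic per-physical-core view, with one core reserved for ESX):

```python
# Logical processors = sockets x cores_per_socket x (2 if HT else 1).
sockets, cores_per_socket = 2, 4
logical = sockets * cores_per_socket * 2    # HT enabled -> 16 logical processors
physical = sockets * cores_per_socket       # 8 physical cores
allocatable = physical - 1                  # ESX uses one core -> 7 left for VMs
print(logical, physical, allocatable)       # 16 8 7
```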
I recently bought a server with 2 x X5550; they are quad-core (4 cores each), 8 cores in total.
If I check Task Manager, it shows 16 graphs in the CPU usage history.
Shouldn't it be 8, since I have 2 quad-core processors?
Or do the graphs maybe show the threads of the CPU?
The CPUs support HyperThreading, so each core shows up as 2 logical CPUs.
You can always look up the chip specs on Intel's site.