I have been reading about Oracle connection pool sizing. The official documentation says:
For example, suppose a server has 2 CPUs and each CPU has 18 cores.
Each CPU core has 2 threads. Based on the Oracle Real-World Performance
group guidelines, the application can have between 36 and 360
connections to the database instance.
My Oracle server's NUM_CPUS is 16 and NUM_CPU_CORES is 8, but NUM_CPU_SOCKETS is 2. That means we actually have 2 CPUs, but with multithreading the system behaves like it has 16.
I am not sure which value to use in the connection formula. Probably 16, but I'm asking here to be sure.
16 CPUs * 8 cores = minimum 128 connections?
or
2 CPUs * 8 cores = minimum 16 connections?
Which one applies to me? :/
Thanks in advance.
You should use 2 CPUs * 8 cores = 16 minimum connections.
This sentence in the documentation implies that the rule is meant for physical CPUs and cores: "The number of connections should be based on the number of CPU cores and not the number of CPU core threads." Your NUM_CPUS value of 16 must be a count of "logical" CPUs, since the system only has 2 sockets.
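If it helps to see the arithmetic, here is a minimal C++ sketch of the guideline implied by the quoted example (36 cores -> 36 to 360 connections, i.e. roughly 1x to 10x the physical core count). The socket and per-socket core values are assumptions taken from the interpretation above:

#include <iostream>

int main() {
    // Assumed topology: 2 sockets, 8 physical cores per socket.
    // (If NUM_CPU_CORES = 8 is already the total core count, use that instead.)
    const int sockets = 2;
    const int coresPerSocket = 8;
    const int physicalCores = sockets * coresPerSocket; // 16

    // Guideline from the quoted example: ~1x to 10x the number of
    // physical cores, never the number of hardware threads.
    std::cout << "Pool size range: " << physicalCores
              << " to " << physicalCores * 10 << " connections\n"; // 16 to 160
}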
I am a beginner in cluster configuration. I know that our cluster has two types of worker nodes:
Type 1:
16 x 4 TB disks
128 GB RAM
2 x 8-core CPUs
Type 2:
12 x 1.2 TB disks
256 GB RAM
2 x 10-core CPUs
I am confused about the configuration. What does "2 x 8-core CPUs" mean? Does it mean 2 processors with 8 physical cores each? So if my processors support hyperthreading, will I have 2 x 8 x 2 = 32 virtual cores?
And does "12 x 1.2 TB disks" mean 12 disks with 1.2 TB each?
Usually "2 x 8-core CPUs" means that you have 2 physical chips on your motherboard, each having 8 cores. If you enable hyperthreading, you then have 32 virtual cores.
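As a quick sanity check (a sketch, not specific to your hardware), you can compare that expected figure against what the OS reports; std::thread::hardware_concurrency() returns the number of logical (virtual) cores:

#include <iostream>
#include <thread>

int main() {
    // Assumed topology from the answer: 2 sockets x 8 cores x 2 threads (HT on).
    const unsigned sockets = 2, coresPerSocket = 8, threadsPerCore = 2;
    std::cout << "Expected virtual cores: "
              << sockets * coresPerSocket * threadsPerCore << "\n"; // 32

    // What the OS actually exposes; should also be 32 if hyperthreading is on.
    std::cout << "hardware_concurrency(): "
              << std::thread::hardware_concurrency() << "\n";
}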
The disk count is either what you stated, or the number refers to nodes; in that case you have 16 nodes with a 4 TB disk each and 12 nodes with a 1.2 TB disk each.
I am just wondering how someone ends up with this hardware without knowing what it means. Can you send me some nodes? :)
My laptop has 4 logical processors (two physical); logical CPUs 1 and 2 map to core 1, and logical CPUs 3 and 4 map to core 2 (verified with GetLogicalProcessorInformation()).
I ran a multithreaded matrix multiplication program on my computer with two threads. The first time, I used SetProcessAffinityMask(hProcess, 0x5) (which means logical processors 1 and 3) while the second time I used SetProcessAffinityMask(hProcess, 0xA) (logical processors 2 and 4).
It turned out that the first version was about twice as fast as the second, as though the second version weren't multithreaded at all.
Does anyone have any guesses as to why this might be happening?
Measurements:
Plugged in (full CPU):
Affinity mask: 0x3 (0011b), 9 gflop/s
Affinity mask: 0x5 (0101b), 17 gflop/s
Affinity mask: 0x6 (0110b), 17 gflop/s
Affinity mask: 0x9 (1001b), 9 gflop/s
Affinity mask: 0xA (1010b), 9 gflop/s
Affinity mask: 0xC (1100b), 9 gflop/s
On battery (clocked down):
Affinity mask: 0x3 (0011b), 5 gflop/s
Affinity mask: 0x5 (0101b), 10 gflop/s
Affinity mask: 0x6 (0110b), 10 gflop/s
Affinity mask: 0x9 (1001b), 5 gflop/s
Affinity mask: 0xA (1010b), 2 gflop/s
(--> Very interesting: why half speed on battery but normal speed on AC?! This one varies a lot, between 1.5 and 2.5 gflop/s, unlike the others.)
Affinity mask: 0xC (1100b), 5 gflop/s
Does this imply that the fourth logical CPU is not doing anything (!)? (Every run whose mask includes the fourth CPU is slow.)
Update:
I just ran the same thing with the High Performance power profile on battery. The results are inconsistent: this time I got a 2x speedup for masks 5, 6, and 10, but no speedup for mask 12. I'll try to run the tests again on AC power, but ultimately it seems this result is a combination of power management, Turbo Boost, scheduling inconsistencies, etc., and it's harder to measure than I previously thought. :(
SetProcessAffinityMask() does not guarantee you will have one thread per core; only that the threads you have will run on the cores you have allowed.
Perhaps the OS is scheduling differently.
Also, I'm surprised 1 and 2 are on core 1. Usually, logical processor numbers interleave over physical cores, to provide an inherent load balancing. I would expect 1 and 3 to be on core 1, 2 and 4 to be on core 2.
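To take the scheduler out of the equation entirely, you can pin each thread to a single logical processor with SetThreadAffinityMask instead of relying on the process-wide mask. A minimal sketch (Worker is a hypothetical stand-in for one half of the matrix multiplication; the masks are just the 0x5 pair from the question):

#include <windows.h>
#include <cstdio>

DWORD WINAPI Worker(LPVOID param) {
    (void)param; // ... one half of the computation would go here ...
    return 0;
}

int main() {
    // Create the threads suspended so the affinity is set before they run.
    HANDLE t1 = CreateThread(NULL, 0, Worker, NULL, CREATE_SUSPENDED, NULL);
    HANDLE t2 = CreateThread(NULL, 0, Worker, NULL, CREATE_SUSPENDED, NULL);
    // 0x1 = logical CPU 1, 0x4 = logical CPU 3 (the same pair as mask 0x5).
    if (!SetThreadAffinityMask(t1, 0x1) || !SetThreadAffinityMask(t2, 0x4))
        fprintf(stderr, "SetThreadAffinityMask failed: %lu\n", GetLastError());
    ResumeThread(t1);
    ResumeThread(t2);
    WaitForSingleObject(t1, INFINITE);
    WaitForSingleObject(t2, INFINITE);
    CloseHandle(t1);
    CloseHandle(t2);
    return 0;
}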
No, not all cores are equal. Only one is the boot core. Furthermore, in many cases all IRQs (or at least IRQs from a majority of the devices) are directed to a single core.
More important to your observed behavior, not all sets of cores are equal. In a NUMA memory architecture (which has been relatively mainstream in x86 since Intel Hyperthreading and the AMD Opteron), there's an ideal group of processors which can efficiently access a particular region of memory, and all other processors will pay a significant penalty to access that range.
With Hyperthreading, it's not main system memory that's connected non-uniformly, but L1 and L2 cache. If your process migrates between the two virtual processors associated to the same physical core, the cache remains valid. But if it migrates to the other physical core, cached data has to be copied and ownership transferred to the other cache. For some workloads, this could make a big difference.
It would be good to know what physical CPU this is, but I'm assuming from your phrasing about logical processors that there is 1 physical socket, 2 CPU cores, and hyperthreading is enabled giving you 4 logical processors.
The short answer is: for this complicated definition of "processor", no, not all processors are created equal. Hyperthreaded logical cores share execution resources, and if there's contention for those resources they won't be as fast as separate physical cores. This sharing can take place at different levels for both hyperthreaded and multicore processors (ALUs, execution resources, cache at various levels, etc.), but in broad terms, physical cores in the same socket won't be affected much by what the other core(s) are doing, whereas logical cores implemented by hyperthreading will be hugely affected by what their hypertwin is doing.
Another difference between different CPUs: As Ben said, your OS may process most hardware interrupts on a single CPU, which means that CPU will seem slower for other purposes, but I'd be surprised if the interrupt load is enough to impact performance anywhere near this much.
The results you got -- on processors A and B (being intentionally ambiguous about which 2 processors those are) you get double the performance of A alone, but on processors A and C you get approximately the same performance as A alone -- sure sound like hyperthreading is the difference, where A and C are hypertwins in the same physical core, and B is in the other physical core. You said that GetLogicalProcessorInformation() claims otherwise, but it's not unheard of for the BIOS tables on which that depends to have errors.
I would run Task Manager, keep an eye on loads on each CPU before you run your test to get an idea of how much else is going on and where Windows schedules it, then run your test again a few times, for different combinations of CPU affinity, and see if you can confirm or deny this theory.
Have you checked the return code from SetProcessAffinityMask to see if there was an error? If the call fails, you might get stuck on one logical processor. According to the documentation, you can only use the bits that are set in the result of GetProcessAffinityMask.
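Something along these lines (a sketch; 0x5 is just the example mask from the question) validates the requested mask against the current process mask and surfaces any failure:

#include <windows.h>
#include <cstdio>

int main() {
    DWORD_PTR processMask = 0, systemMask = 0;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &processMask, &systemMask)) {
        fprintf(stderr, "GetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }
    printf("process mask: 0x%llx, system mask: 0x%llx\n",
           (unsigned long long)processMask, (unsigned long long)systemMask);

    // Only bits that are set in the process mask are legal to request.
    const DWORD_PTR requested = 0x5;
    if ((requested & processMask) != requested ||
        !SetProcessAffinityMask(GetCurrentProcess(), requested)) {
        fprintf(stderr, "SetProcessAffinityMask(0x5) failed: %lu\n", GetLastError());
        return 1;
    }
    return 0;
}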
You say you've tried masks of 0x5, 0xA, and 0x9. I'd be curious to see the results with 0x3.
I recently bought a server with 2 x X5550 CPUs; they are quad-core (4 cores each), 8 cores in total.
If I check Task Manager, the CPU usage history shows 16 graphs.
Shouldn't it be 8, since I have 2 quad-core processors?
Or do the graphs maybe show the threads of the CPUs?
The CPUs support HyperThreading, so each core presents 2 logical CPUs: 2 sockets x 4 cores x 2 threads = 16.
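If you want to confirm that programmatically (a minimal sketch), GetSystemInfo() reports the number of logical processors, which is what Task Manager graphs:

#include <windows.h>
#include <cstdio>

int main() {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    // With two quad-core X5550s and HyperThreading enabled this should
    // print 16, matching the 16 graphs in Task Manager.
    printf("Logical processors: %lu\n", si.dwNumberOfProcessors);
}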
You can always look up the chip specs on Intel's site.
On a dual quad-core machine, GetProcessAffinityMask (or the "Set affinity" dialog in taskmgr.exe) reports eight logical processors. How do I find out which logical processor is on which physical processor? Especially: which logical processors are on the same physical processor?
EDIT: If it is not possible to do this programmatically, do anyone just know what the normal mapping is? Are the first four on the first processor and the second four on the second or are the odd numbered on the first and the even numbered on the second?
You can use the Win32_Processor WMI class to query the number of cores, number of logical processors, architecture, cache memory, and other information about the CPUs in the system.
To query the relationships between the logical processors in a system, you can use the GetLogicalProcessorInformation API function.
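For example, a minimal sketch (error handling abbreviated) that prints one affinity mask per physical core; two bits set in the same mask means those two logical processors are hypertwins on that core:

#include <windows.h>
#include <cstdio>
#include <vector>

int main() {
    // First call with an empty buffer to learn the required size.
    DWORD len = 0;
    GetLogicalProcessorInformation(NULL, &len);
    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
        len / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (!GetLogicalProcessorInformation(info.data(), &len))
        return 1;
    for (const auto& e : info) {
        if (e.Relationship == RelationProcessorCore)
            printf("physical core: logical CPU mask 0x%llx\n",
                   (unsigned long long)e.ProcessorMask);
    }
    return 0;
}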
In case you don't want to write the code yourself, Sysinternals' handy Coreinfo utility comes closest to answering your questions. It uses GetLogicalProcessorInformation, as Mehrdad recommends. For a Xeon E5640 (quad core, 8 threads), Coreinfo gives you:
c:\App\SysInternals>Coreinfo.exe -c
Coreinfo v3.0 - Dump information on system CPU and memory topology
Copyright (C) 2008-2011 Mark Russinovich
Sysinternals - www.sysinternals.com
Logical to Physical Processor Map:
**------ Physical Processor 0 (Hyperthreaded)
--**---- Physical Processor 1 (Hyperthreaded)
----**-- Physical Processor 2 (Hyperthreaded)
------** Physical Processor 3 (Hyperthreaded)
There are 8 * for the 8 hyperthreads, two per core, as expected for this chip. What's not clear, though, is how the arrangement of * matches up with the list of logical processors as Windows presents them. For instance, Task Manager gives me a dialog for assigning processor affinity, labeled CPU 0 through CPU 7, for any process. It's fair (though not guaranteed) to assume that you can take Coreinfo's output and number the logical processors left to right. So "CPU 5" would be the second hyperthread running on physical processor 2.
The numbering is done in a sequential manner: first all physical cores, followed by the logical cores [1].
[1] CPU Numbering on a hyperthreading-enabled system