I am currently volunteering in a lab to learn about Linux servers, and I am also interested in learning about cluster computing techniques. The lab has a small cluster with one head node and two compute nodes. I ran the lscpu command on the head node and on compute node1 and node2, and all three report the same values:
CPUs - 24 on the head node, computenode1, and computenode2. Is this referring to 24 physical CPUs on the motherboard?
Sockets - 2 on all three nodes. Can anyone explain this?
Cores per socket - 6 on all three nodes. Can anyone explain this?
Threads per core - 2 on all three nodes. Can anyone explain this?
A socket is the physical socket on the motherboard where a physical CPU package is installed. A normal PC has only one socket.
Cores are the number of CPU cores per package. A modern standard CPU for a standard PC usually has two or four cores.
And some CPUs can run more than one parallel thread per CPU core. Intel (the most common CPU manufacturer for standard PCs) has either one or two threads per core, depending on the CPU model (two threads per core is what Intel calls Hyper-Threading).
If you multiply the number of sockets, cores per socket, and threads per core, i.e. 2 * 6 * 2, you get the number of "CPUs": 24. These aren't real physical CPUs, but the number of parallel threads of execution your system can run.
Just the fact that you have 6 cores per socket is a sign that you have a high-end workstation or server. The fact that you have two sockets makes it a very high-end machine: these days usually only servers have that, not even high-end workstations.
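As a quick sanity check, here is a minimal Python sketch (assuming a Linux machine where lscpu is available) that pulls those three fields out of lscpu and reproduces the multiplication:

    import subprocess

    # Parse the "Key: value" lines of lscpu into a dict.
    out = subprocess.run(["lscpu"], capture_output=True, text=True, check=True).stdout
    info = {}
    for line in out.splitlines():
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()

    sockets = int(info["Socket(s)"])
    cores_per_socket = int(info["Core(s) per socket"])
    threads_per_core = int(info["Thread(s) per core"])

    # Logical CPUs = sockets * cores per socket * threads per core.
    logical = sockets * cores_per_socket * threads_per_core
    print(f"{sockets} x {cores_per_socket} x {threads_per_core} = {logical} logical CPUs")
    print("lscpu reports CPU(s) =", info["CPU(s)"])  # should print the same number

On your head node this prints 2 x 6 x 2 = 24, matching the CPU(s) line.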
I have to install an Oracle 18c database server. Currently I have 4 x 960 GB solid-state drives, 32 Intel Xeon processors at 2.10 GHz, and 255 GB of RAM. The database mounted on this server is approximately 750 GB in size, grows by about 350 GB per year, and has a concurrency of 500 simultaneous connections. Do you consider this hardware adequate to obtain good performance from Oracle, or should it be upgraded?
The answer is, "it depends":
Is this enough storage capacity to meet your long term needs for data, archived transaction logs, backups, auditing, and whatever else you need to store?
Do you have an existing production system, and what sort of performance do you get out of it? How does the old hardware and new hardware compare?
How many transactions does your database process in an hour? Your overall data might only grow by 350 GB a year, but if there are a lot of updates and not just inserts, then you could be archiving that much in log files and backups every day.
What those 500 concurrent sessions are actually doing at any moment will drive the size of your memory and processor requirements, as will the amount of data that you need to cache to support them.
Do you have any HA requirements, and if so how does that affect the configuration of your storage (i.e. do you lose some percentage of your storage to RAID)?
Which operating system you choose can also affect how efficiently your hardware performs.
My personal preference is to use virtualized servers running Oracle Linux with an SSD SAN, but that might not be right for you. In the end there really isn't enough information in your question to say for sure whether the hardware you describe is sufficient for your needs.
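To make the first point concrete, here is a rough back-of-envelope sketch in Python using the figures from the question. The RAID factor and the log/backup multiplier are illustrative assumptions, not measurements from your system:

    # Storage headroom estimate. All inputs come from the question except
    # the two ASSUMPTION lines.
    raw_storage_gb = 4 * 960     # four 960 GB SSDs
    db_size_gb = 750             # current database size
    yearly_growth_gb = 350       # stated growth per year

    usable_fraction = 0.5        # ASSUMPTION: RAID 10 halves usable capacity
    overhead_factor = 2.0        # ASSUMPTION: logs/backups/audit double the footprint

    usable_gb = raw_storage_gb * usable_fraction
    for year in range(4):
        footprint_gb = (db_size_gb + year * yearly_growth_gb) * overhead_factor
        fits = "fits" if footprint_gb <= usable_gb else "does NOT fit"
        print(f"year {year}: {footprint_gb:.0f} GB needed vs {usable_gb:.0f} GB usable -> {fits}")

Under those (deliberately pessimistic) assumptions the disks are already tight in year one, while with RAID 5 and off-box backups the same drives could last several years -- which is exactly why the question can't be answered without more detail.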
I'm about to build a desktop computer and I'm trying to understand how these PCIe lanes are distributed. The goal is being able to calculate how many lanes I need for a certain setup. I'm looking at the Asus Z170-P motherboard, which according to the specifications [1]:
It contains the Z170 chipset.
You can read on the board that it is "CrossfireX Ready", which I believe implies you can plug in two graphics cards.
The specs say it has two PCIe x16 slots, one that works at x16 mode and another one that only works at x4 mode.
First, according to the Z170 chipset specifications, it supports up to 20 PCIe lanes. However, there is no single processor that fits into the LGA1151 socket with support for 20 or more PCIe lanes [2]. Why have a chipset with support for 20 lanes when the processor will only be able to handle up to 16?
Second, the supported PCIe port configurations of the chipset are "1x16, 2x8, 1x8+2x4". If I were to plug in two graphics cards, would they both work at x4 mode, or at x8/x4 modes? Shouldn't a motherboard designed for two graphics cards handle 32+ PCIe lanes so that both cards can work at x16 mode?
The (up to) 20 PCIe lanes from the Z170 chipset are in addition to the 16 lanes that come directly out of the CPU.
I don't see any reason why it wouldn't run one graphics card at x16 and the other at x4. But it does seem odd to me that they call it "Crossfire-ready" without two x16 slots.
More info on the Z170 here:
http://www.tomshardware.com/reviews/skylake-intel-core-i7-6700k-core-i5-6600k,4252-2.html
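To make the lane arithmetic explicit, here is a small Python sketch of this board's lane budget. The slot wiring follows the specs quoted in the question, and treating the CPU and chipset lanes as two separate pools reflects the point above:

    # PCIe lane budget for the Asus Z170-P, per the specs in the question.
    cpu_lanes = 16         # LGA1151 CPUs expose 16 PCIe 3.0 lanes
    chipset_lanes = 20     # the Z170 chipset adds up to 20 more

    # How the two x16-sized slots are electrically wired on this board:
    slot1 = 16             # fed by the CPU, runs at x16
    slot2 = 4              # fed by the chipset, runs at x4

    print(f"total lanes: {cpu_lanes + chipset_lanes}")                 # 36
    print(f"two GPUs installed: x{slot1} (CPU) + x{slot2} (chipset)")  # x16 + x4
    print(f"chipset lanes left for M.2/SATA/USB/NIC: {chipset_lanes - slot2}")

So with two cards you get x16 + x4, not x8/x8: an x8/x8 split would require both slots to be wired to the CPU's 16 lanes, which this particular board does not do.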
A PC has a microprocessor which processes 16 instructions per microsecond. Each instruction is 64 bits long. Its memory can retrieve or store data/instructions at a rate of 32 bits per microsecond.
Mention 3 options to upgrade system performance. Which option gives the most improved performance?
And the answer provided is
a) upgrade processor to one with twice the speed
b) upgrade memory with one twice the speed
c) double clock speed
(b) gives most improved performance.
Overcoming the bottleneck of a PC improves its overall performance.
However, my problem is that I am not sure why (b) gives the most improved performance. Additionally, would (a) and (c) give the same performance? Can it be calculated? I am not sure how these different parts affect performance.
Your question's leading paragraph contains the necessary numbers to see why it's b):
The CPU's processing rate is fixed at 16 instructions per microsecond, so an instruction takes 1/16 of a microsecond to execute.
Each instruction is 64 bits long, but the memory system retrieves data at 32 bits per microsecond, so it takes two microseconds to retrieve a single 64-bit instruction.
The bottleneck is clear: it takes far longer to retrieve an instruction (2 μs) than it does to execute it (1/16 μs).
If you increase the CPU speed (answer (a)), the CPU will execute an individual instruction faster, but it will still wait idle about 2 μs for the next instruction to arrive, so the improvement is wasted.
To eliminate the bottleneck you need to increase the memory system's speed to match the CPU's execution speed: the memory needs to read 64 bits in 1/16 μs (equivalently, 32 bits in 1/32 μs).
I assume answer c) refers to increasing the speed of some systemwide master clock which would also increase the CPU and Memory data-rates. This would improve performance, but the CPU would still be slaved to the memory speed.
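To answer "can it be calculated?": yes, with a simple model in which throughput is limited by the slower of the fetch and execute stages. A sketch in Python (the min() bottleneck model is my simplification, but it matches the reasoning above):

    INSTRUCTION_BITS = 64

    def throughput(cpu_instr_per_us, mem_bits_per_us):
        """Instructions per microsecond, limited by the slower stage."""
        fetch_rate = mem_bits_per_us / INSTRUCTION_BITS  # instructions fetched per us
        return min(cpu_instr_per_us, fetch_rate)

    print(throughput(16, 32))  # baseline:           0.5 instr/us
    print(throughput(32, 32))  # (a) faster CPU:     0.5 -- still memory-bound
    print(throughput(16, 64))  # (b) faster memory:  1.0 -- bottleneck relieved
    print(throughput(32, 64))  # (c) doubled clock:  1.0 -- CPU still mostly idle

So (a) buys nothing, (b) doubles throughput by attacking the actual bottleneck, and (c) only matches (b) here because doubling the master clock happens to double the memory rate too, with the CPU still slaved to memory.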
Note that your question describes a simplistic computer. Computers were originally like this, with the CPU accessing memory directly, instruction by instruction. As CPUs got faster and memory did not, computer engineers added cache levels: much faster (but much smaller) memory from which instructions and data can be read as fast as the CPU can execute them, solving the bottleneck without making all system memory match the CPU's speed.
I have a mix of Dell machines, all bought at the same time, with only slightly different specs. The Windows Experience index is almost identical.
Yet my program runs at a speed difference of 25-40% (i.e. really noticeable) on the machines.
They are all out-of-the-box business Dells, with no extra programs running (none that take up significant resources, anyway).
My program is graphics-based, loading in a lot of data and then processing it on the CPU. CPU usage is identical on all machines; I only use a single thread.
I'd expect maybe 5-10% variation based on the processors (according to benchmarks).
My programmer spidey-sense tells me that something is wrong here.
Is this going to be something like cache misses? Is there anything else I should be looking at?
I have used debugging tools such as WinDbg for these situations; they contain a lot of tooling to nail down where the exact bottleneck is. For example, run the program side by side on two machines and identify the point at which the slower machine starts lagging.
If the physical specs of the machines are identical and the lag appears while the graphics data is being downloaded, the most likely difference is network configuration. In that case a tool such as Wireshark will show you what hops the application takes over the network to retrieve the data.
If the network configuration is identical too, I wouldn't rule out a physical problem with the machine, such as faulty RAM or a dodgy network cable. Also, check the running processes side by side and see if there is any difference; kill unnecessary tasks that may be taking up memory on the slower computer.
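One cheap way to separate "this machine is slower" from "my program behaves differently here" is a fixed, single-threaded CPU micro-benchmark run on each machine. A minimal sketch (the SHA-256 workload and the iteration count are arbitrary choices for illustration):

    import hashlib
    import time

    def cpu_benchmark(iterations=2_000_000):
        """Fixed single-threaded CPU workload; returns wall-clock seconds."""
        start = time.perf_counter()
        data = b"benchmark"
        for _ in range(iterations):
            data = hashlib.sha256(data).digest()
        return time.perf_counter() - start

    # Run this on every machine. If the ratio between machines matches the
    # 25-40% gap you see in your program, the hardware really differs that
    # much; if the benchmark times are close, look at the environment instead.
    print(f"{cpu_benchmark():.2f} s")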
What is the recommended Amazon EC2 instance size for creating a large (100k+ users) Ejabberd cluster?
That is, is it more efficient/less costly to use a larger number of small instances, or a smaller number of large instances?
And are HVM images, intended for cluster computing, of any use for an Ejabberd cluster, or will standard images be enough for the purpose?
ejabberd can use lots of memory but doesn't use a lot of CPU, so memory is the biggest consideration. It really depends how many connections you're talking about; at 100k+ connections you will need at least a Large instance.
From Metajack's blog (the creator of Strophe, the JavaScript XMPP library):
"For Chesspark, we use over a gig of RAM for a few hundred connections. Jabber.org uses about 2.7GB of RAM for its 10k+ connections."
A large instance has 7.5GB of RAM, which isn't enough for 100k+ connections. I'd say you're looking at a 2-3 server cluster of Large instances or a High Memory Instance.
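Scaling the Jabber.org figure linearly gives a rough lower bound on memory. A quick sketch (linear scaling is an assumption; real per-connection memory varies with roster sizes, traffic, and ejabberd version):

    # Rough capacity estimate from the Jabber.org data point quoted above.
    ref_ram_gb = 2.7           # RAM used by Jabber.org...
    ref_connections = 10_000   # ...for its ~10k connections

    target_connections = 100_000
    large_instance_ram_gb = 7.5

    per_conn_mb = ref_ram_gb * 1024 / ref_connections
    needed_gb = per_conn_mb * target_connections / 1024
    instances = -(-needed_gb // large_instance_ram_gb)  # ceiling division

    print(f"~{per_conn_mb:.2f} MB per connection -> ~{needed_gb:.0f} GB total")
    print(f"at least {instances:.0f} Large instances, before any headroom")

This rough bound comes out at about 27 GB, a bit above what 2-3 Large instances provide, which is another argument for leaving headroom or going to a High-Memory instance.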
HVM is only really needed when you need hardware support that is not provided by the software virtual machine (e.g. graphics card processing); it is not required for memory or CPU.