Can someone tell me the exact difference between multiprocessing and parallel processing? I am a little bit confused. Thanks for your help.
Multiprocessing
Multiprocessing is the use of two or more central processing units
(CPUs) within a single computer system. The term also refers to the
ability of a system to support more than one processor and/or the
ability to allocate tasks between them.
Parallel Processing
In computers, parallel processing is the processing of program
instructions by dividing them among multiple processors with the
objective of running a program in less time. In the earliest
computers, only one program ran at a time.
Multiprocessing: running more than one process on a single processor.
Parallel processing: running a process on more than one processor.
Multiprocessing
A processing technique in which multiple processors or multiple processing cores in a single computer each work on a different job.
Parallel processing
A processing technique in which multiple processors or multiple processing cores in a single computer work together to complete one job more quickly.
Multiprocessing operating systems enable several programs to run concurrently. UNIX is one of the most widely used multiprocessing systems. Multiprocessing means the use of two or more central processing units (CPUs) at the same time. Most new computers have dual-core processors, or feature two or more processors, and are therefore called multiprocessor computers.
Parallel Processing: The simultaneous use of more than one CPU to execute a program. Ideally, parallel processing makes a program run faster because there are more engines (CPUs) running it. Most computers have just one CPU, but some models have several. There are even computers with thousands of CPUs. With single-CPU computers, it is possible to perform parallel processing by connecting the computers in a network.
MULTIPROCESSING: simply means using two or more processors within a computer, or having two or more cores within a single processor, to execute more than one process simultaneously.
PARALLEL PROCESSING: is the execution of one job by dividing it across different computers/processors.
Multiprocessing is doing work with the use of many processors or cores.
Parallel processing is dividing a piece of work into small parts and giving every part a chance to be processed concurrently.
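To make the "dividing one job among processors" idea concrete, here is a minimal sketch in Elixir (the language that comes up later in this thread). Task.async_stream runs the per-item work in separate lightweight processes, which the runtime spreads across the available cores:
1..1_000_000
|> Task.async_stream(fn n -> n * n end)        # one process per item, limited to the core count
|> Enum.map(fn {:ok, square} -> square end)    # collect the results, in order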
Related
I understand that tasks can run concurrently on multicore systems, with each task assigned to a different core.
But what about single-core systems? Is it just task switching?
In most systems concurrency is simulated. The OS switches from task to task, allocating the resources.
In single-core systems, this is the only way to run multiple threads at the same time. Even in multicore systems you have something similar in place: modern systems run far more tasks than they have cores, so the OS keeps shuffling through threads, allocating them to the available cores based on priority and load.
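You can watch this kind of time slicing in miniature with the Erlang VM (which comes up later in this thread). A sketch, assuming you start the VM with a single scheduler (iex --erl '+S 1:1'): spawn far more processes than cores, and all of them still make progress because the runtime switches between them, just as an OS does.
tasks =
  for i <- 1..100 do
    Task.async(fn ->
      Enum.reduce(1..100_000, 0, &+/2)   # some busy work
      i
    end)
  end

Task.await_many(tasks) |> length()   # => 100, all interleaved on one core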
I'm new to Elixir, and I'm starting to read through Dave Thomas's excellent Programming Elixir. I was curious how far I could take the concurrency of the "pmap" function, so I iteratively boosted the number of items to square from 1,000 to 10,000,000. Out of curiosity, I watched the output of htop as I did so, usually peaking out with CPU usage similar to that shown below:
After showing the example in the book, Dave says:
And, yes, I just kicked off 1,000 background processes, and I used all the cores and processors on my machine.
My question is: how come on my machine only cores 1, 3, 5, and 7 are lighting up? My guess would be that it has to do with my iex process being only a single OS-level process, and OS X is managing the reach of that process. Is that what's going on here? Is there some way to ensure all cores get utilized for performance-intensive tasks?
Great comment by @Thiago Silveira about the first line of iex's output. The part [smp:8:8] says how many scheduler threads Erlang is using (these run as operating-system threads, one per logical core by default). You can control this with the -smp flag if you want to disable it:
iex --erl '-smp disable'
This will ensure that you have only one scheduler. You can achieve a similar result by leaving symmetric multiprocessing enabled, but directly setting NumberOfSchedulers:NumberOfSchedulersOnline:
iex --erl '+S 1:1'
Each scheduler is an operating-system thread that runs Erlang processes, and you can easily check how many of them you currently have:
:erlang.system_info(:schedulers_online)
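A few related introspection calls (these all exist in stock Erlang/OTP) are handy for comparing what the VM sees with what htop shows:
:erlang.system_info(:schedulers)           # scheduler threads the VM started with
:erlang.system_info(:schedulers_online)    # scheduler threads currently in use
:erlang.system_info(:logical_processors)   # logical cores reported by the OS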
To answer your question about performance: if your processors are not working at full capacity (100%) and none of them is idle (0%), then it is probable that making the load more evenly distributed will not speed things up. Why?
CPU usage is measured by sampling the processor state at many points in time. Each sample is either "working" or "idle". 82% CPU usage means that you can run a couple more tasks on this CPU without slowing down the other tasks.
Erlang schedulers try to be smart and not migrate Erlang processes between cores unless they have to, because migration requires copying. The migration occurs, for example, when one of the schedulers is idle; it can then steal a process from another scheduler's run queue.
The next thing that may cause such a big discrepancy between odd and even cores is Hyper-Threading. On my dual-core processor, htop shows 4 logical cores. In your case you probably have 4 physical cores and 8 logical ones because of HT. It might be the case that you are already utilizing your physical cores at 100%.
Another thing: pmap needs to calculate each result in a separate process, but at the end it sends the results back to the caller, which may be a bottleneck. The more messages you send, the less CPU utilization you can achieve. For fun, you can try giving the processes a task that is really CPU-intensive, like calculating the Ackermann function. You can even calculate how much of your job is sequential and how much is parallel using Amdahl's law, by measuring execution times for different numbers of cores.
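A sketch of that experiment (the module name Ack and the arguments 3 and 10 are arbitrary choices): give each scheduler one big, message-free computation and watch the utilization climb.
defmodule Ack do
  # Naive Ackermann function: pure, deeply recursive, CPU-bound work
  def ack(0, n), do: n + 1
  def ack(m, 0), do: ack(m - 1, 1)
  def ack(m, n), do: ack(m - 1, ack(m, n - 1))
end

# One heavyweight task per scheduler; almost no message passing,
# so every core should approach 100% while these run.
1..System.schedulers_online()
|> Task.async_stream(fn _ -> Ack.ack(3, 10) end, timeout: :infinity)
|> Enum.to_list()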
To sum up: the CPU utilization from the screenshot looks really great! You don't have to change anything for more performance-intensive tasks.
Concurrency is not Parallelism
In order to get good parallel performance out of Elixir/BEAM coding you need to have some understanding of how the BEAM scheduler works.
This is a very simplistic model, but the BEAM scheduler gives each process 2000 reductions before it swaps out the process for the next process. Reductions can be thought of as function calls. By default a process runs on the core/scheduler that spawned it. Processes only get moved between schedulers if the queue of outstanding processes builds up on a given scheduler. By default the BEAM runs a scheduling thread on each available core.
What this implies is that in order to get the most use out of the processors, you need to break your tasks up into pieces of work large enough to exceed the standard "reduction" slice. In general, pmap-style parallelism only gives a significant speedup when you chunk many items into a single task, as in the sketch below.
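A sketch of that chunking, building on the book's squaring example (the chunk size of 10,000 is an arbitrary assumption to tune):
1..10_000_000
|> Enum.chunk_every(10_000)   # big slices that outlive the ~2000-reduction time slice
|> Task.async_stream(fn chunk -> Enum.map(chunk, &(&1 * &1)) end)
|> Enum.flat_map(fn {:ok, squares} -> squares end)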
The other thing to be aware of is that some parts of the BEAM use a spin/wait loop when awaiting work, and that can skew usage when you use a tool like htop to examine CPU usage. You'll get a much better understanding of your program's performance by using :observer.
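:observer ships with Erlang/OTP (it needs the wx GUI libraries installed) and can be started from iex with:
:observer.start()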
What is the difference between a core and a processor?
I've already looked for it on Google, but I only get definitions for multi-core and multi-processor, which is not what I am looking for.
A core is usually the basic computation unit of the CPU - it can run a single program context (or multiple ones if it supports hardware threads such as hyperthreading on Intel CPUs), maintaining the correct program state, registers, and correct execution order, and performing the operations through ALUs. For optimization purposes, a core can also hold on-core caches with copies of frequently used memory chunks.
A CPU may have one or more cores to perform tasks at a given time. These tasks are usually software processes and threads that the OS schedules. Note that the OS may have many threads to run, but the CPU can only run X such tasks at a given time, where X = number of cores × number of hardware threads per core. The rest would have to wait for the OS to schedule them, whether by preempting currently running tasks or by any other means.
In addition to the one or more cores, the CPU will include some interconnect that connects the cores to the outside world, and usually also a large "last-level" shared cache. There are multiple other key elements required to make a CPU work, but their exact locations may differ according to design. You'll need a memory controller to talk to the memory, and I/O controllers (display, PCIe, USB, etc.). In the past these elements were outside the CPU, in the complementary "chipset", but most modern designs have integrated them into the CPU.
In addition, the CPU may have an integrated GPU, and pretty much everything else the designer wanted to keep close for performance, power, and manufacturing considerations. CPU design is mostly trending toward what's called a system on a chip (SoC).
This is a "classic" design, used by most modern general-purpose devices (client PC, servers, and also tablet and smartphones). You can find more elaborate designs, usually in the academy, where the computations is not done in basic "core-like" units.
An image may say more than a thousand words:
* Figure describing the complexity of a modern multi-processor, multi-core system.
Source: https://software.intel.com/en-us/articles/intel-performance-counter-monitor-a-better-way-to-measure-cpu-utilization
Let's first clarify what a CPU is and what a core is. A central processing unit (CPU) can have multiple core units; each of those cores is a processor by itself, capable of executing a program, but self-contained on the same chip.
In the past, one CPU was distributed across quite a few chips, but as Moore's Law progressed it became possible to fit a complete CPU inside one chip (die). Since the 90s, manufacturers started to fit more cores into the same die; that's the concept of multi-core.
These days it is possible to have hundreds of cores in the same package (chip or die); think of GPUs or Intel's many-core Xeon line. Another technique developed in the 90s was simultaneous multi-threading: it turned out a single core could run an extra hardware thread by duplicating a small amount of state (such as the register set) while sharing execution resources like the ALUs.
So basically, a CPU can have multiple cores, each of them capable of running one or more threads at the same time. We may expect to have more cores in the future, but with more difficulty being able to program them efficiently.
CPU stands for central processing unit. Until around 2002 we had only single-core processors, i.e., a machine would only perform a single task or program at a time.
To have multiple programs run at a time, we had to use multiple processors for executing multiple processes at a time, which required a motherboard with multiple sockets, and that is very expensive.
So Intel introduced the concept of hyper-threading: it presents a single CPU as two virtual CPUs, i.e., as if we had two cores for our tasks. The CPU is still single, but it only pretends (masquerades) that it has a dual CPU and performs multiple tasks. Having real multiple cores is better than that, so people developed the multi-core processor: multiple processors on a single die, i.e., multiple cores.
In the early days, before the 90s, processors weren't able to multitask that efficiently, because a single processor could handle just a single task at a time. So when we used to say that an antivirus, Microsoft Word, VLC, and other programs were all running at the same time, that wasn't actually true. The processor would work on a single task, then pause it, take another task, complete it if it was a short one (or pause it too and add it to the queue), and then move on to the next. But each pause was so small that you never noticed the task had been paused. E.g., while listening to music on VLC with other apps running "simultaneously", VLC is actually pausing in between, but the gaps are too short to notice, so the music sounds continuous.
But this was about the old processors...
Nowadays processors are multi-core. Each "core" is comparable to one of those earlier processors by itself, embedded onto a single chip, forming a single processor. So now we understand what cores are: they are mini processors which combine to become one processor. Each core can handle a single process at a time, or multiple threads, depending on how the OS is designed, and each follows the same time-slicing steps described above for the single processor.
E.g., a 6th-gen i7 processor exposes 8 hardware threads (4 cores with Hyper-Threading), i.e., roughly 8 mini processors in one i7, so it can make progress on 8 tasks at once (which is not the same as being 8x faster than the old processors). This is how multitasking is done.
There can be even more cores in a single processor, e.g., Intel's many-core Xeon Phi line.
I hope I explained this well.
I have read all the answers, but this link had the clearest explanation for me of the difference between a CPU (processor) and a core. So I'm leaving here some notes from there.
The main difference between a CPU and a core is that the CPU is an electronic circuit inside the computer that carries out instructions to perform arithmetic, logical, control, and input/output operations, while the core is an execution unit inside the CPU that receives and executes instructions.
Intel's picture is helpful, as shown in Tortuga's answer above. Here's a caption for it.
Processor: One semiconductor chip, the CPU (central processing unit), seated in one socket. Over time, more and more functions have been packed onto the CPU chip. Before single-chip processors appeared (in the early 1970s), one processor might have been spread across multiple chips. In the mid-2010s, system-on-a-chip designs made it slightly more sketchy to equate one processor with one chip, though that's generally what people mean by processor, as in "this computer has an i7 processor" or "this computer system has four processors."
Core: One block of a CPU, executing one instruction at a time. (You'll see people say one instruction per clock cycle, but some CPUs use multiple clock cycles for some instructions.)
What would be a typical or real problem where parallel programming is used? It can be quite challenging to implement. On the internet they explain how to use it, but not why.
Performance is the most common reason to use parallel programming. But not all programs will become faster by using parallel programming: in most cases your algorithm consists of parts that are parallelizable and parts that are inherently sequential. You always have to reason about the potential performance gain of using parallel programming; in some cases the overhead of using it will actually make your program slower. Have a look at Amdahl's law to learn more about the potential performance improvements you can reach.
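Amdahl's law itself is a one-liner. Here is a sketch in Elixir (matching the code used elsewhere on this page); the function name amdahl and the numbers are just for illustration:
# speedup(p, n) = 1 / ((1 - p) + p / n)
# p = parallelizable fraction of the program, n = number of processors
amdahl = fn p, n -> 1 / ((1 - p) + p / n) end

amdahl.(0.95, 8)     # => ~5.9x on 8 cores
amdahl.(0.95, 1000)  # => ~19.6x: the 5% sequential part caps the gain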
If you only want some examples of the usage of parallel computation: there are some classes of algorithms that are inherently parallel; see the "dwarfs of Berkeley" article.
Another reason for using a multithreaded application architecture is its responsiveness. There are certain functions which block program execution for a certain amount of time, e.g., reads from files, the network, waiting for user input, etc. While waiting like this does not consume CPU power, it often blocks or slows the program flow.
Using threads in such a case is simply good practice to make the code clearer. Instead of using (often complex or unintuitive) checks for input, integrating those checks into the program flow, and manually switching between handling input and other tasks, a programmer may choose to use threads and let one thread wait for input while another performs the calculations, as the sketch below shows.
In other words, multiple threads sometimes allow for better use of the different resources at your computer's disposal: the network, disk, input devices, or simply the monitor.
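A small sketch of that pattern in Elixir (the sleep stands in for any blocking read, and the values are placeholders):
slow_read = Task.async(fn ->
  Process.sleep(1_000)          # stands in for a file/network/user-input wait
  {:ok, "data"}
end)

local_work = Enum.reduce(1..1_000_000, 0, &+/2)  # useful work in the meantime
{:ok, data} = Task.await(slow_read, 5_000)       # collect the result when ready
{local_work, data}                               # both results, with no blocked main flow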
Generalization: using multiple threads (including parallel data processing) is advisable when the speed and responsiveness gains outweigh the synchronization costs and work required to parallelize the application.
The reason there is increased interest in parallel programming is partly that the hardware we use today is more parallel (multicore processors, many-core GPUs). To fully benefit from this hardware you need to program in parallel.
Interestingly, parallel processing also improves battery life:
Having 4 cores at 1 GHz draws less power than a single core at 4 GHz.
A phone with a multicore CPU will try to run as many tasks as possible simultaneously, so it can turn off the CPU entirely when all the work is done. This is sometimes called "the rush to idle".
Now, some programs are easier to parallelize than others. You should not randomly try to parallelize your entire code base. But it can be a useful exercise to do so even if there is no business reason: then you will be more prepared for the day when you really need it.
There are very few problems which can't be solved more quickly by a parallel program than by a serial program. There are very few computers which do not have multiple processing units.
I conclude, therefore, that you should use parallel programming all the time.
What is the difference between parallel processing and multi-core processing?
Parallel and multi-core processing both refer to the same thing: the ability to execute code at the same time (on more than one core/CPU/machine). So in this sense, multi-core is just one means of doing parallel processing.
On the other hand, concurrency (which is probably what you mean by parallel processing) refers to having multiple units of execution (threads or processes) that are interleaved. This can happen either on a single-core CPU, or on many cores/CPUs, or even on many machines (clusters).
Summing up: multi-core is a subset of parallel processing, and concurrency can occur with or without parallelism. The field that studies computation spread across many machines is distributed systems or distributed computing.
Parallel processing just refers to a program running more than 1 part simultaneously, usually with the different parts communicating in some way. This might be on multiple cores, multiple threads on one core (which is really simulated parallel processing), multiple CPUs, or even multiple machines.
Multicore processing is usually a subset of parallel processing.
Multicore processing means code working on more than one "core" of a single CPU chip. A core is like a little processor within a processor. So making code work for multicore processing will nearly always be talking about the parallelization aspect (though it would also include removing any core-specific assumptions, which you shouldn't normally have anyway).
As far as algorithm design goes, if it is correct from a parallel-processing point of view, it will also be correct on multicore.
However, if you need to optimise your code to get it to run as fast as possible "in parallel" then the differences between multicore, multi-cpu, multi-machine, or vectorised will make a big difference.
Parallel processing can be done inside a single core with multiple threads.
Multi-Core processing means distributing those threads to make use of the multiple cores in a CPU.
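As a rough sketch of that distribution in Elixir (the exact spread depends on the VM's load balancing), you can ask each spawned process which scheduler, and therefore roughly which core's run queue, it is running on:
1..8
|> Task.async_stream(fn _ -> :erlang.system_info(:scheduler_id) end)
|> Enum.map(fn {:ok, id} -> id end)
|> Enum.uniq()
# e.g. [1, 2, 3, 4] -- more than one scheduler/core involved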