Can GPU be used for a general programming? [closed] - performance

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
It seems that for certain specialized tasks a GPU can be 10 or more times as powerful as a CPU.
Can we make this power more accessible and use it for everyday programming?
For example, a cheap server that easily handles millions of connections? On-the-fly database analytics? Map/reduce, Hadoop, or Storm-like workloads with 10x the throughput?
Is there any movement in this direction? Are there new programming languages or programming paradigms that take advantage of it?

CUDA and OpenCL are two well-established frameworks for GPU programming.
GPU programming uses shaders (called kernels in CUDA and OpenCL) to process input buffers and generate result buffers with high throughput. A shader is a small unit of computation, mostly working on float values, that has its own data context (input buffers and constants) and produces results from it. Each shader invocation is isolated from the others during a task, but you can chain shaders if required.
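To make the shader model concrete, here is a minimal CPU-side sketch in plain Python. It is not GPU code: the names `brighten`, `clamp`, and `run_shader` are made up for illustration, and the loop stands in for what a GPU would run as many parallel invocations. It only shows the shape of the idea: each invocation sees one input element plus constants, and stages can be chained.

```python
from typing import Callable, Dict, List, Sequence

# A "shader" here is just a pure function of one input element plus constants.
Shader = Callable[[float, Dict[str, float]], float]


def brighten(value: float, constants: Dict[str, float]) -> float:
    """Scale one pixel value by a constant gain."""
    return value * constants["gain"]


def clamp(value: float, constants: Dict[str, float]) -> float:
    """Clamp one pixel value into the range [lo, hi]."""
    return max(constants["lo"], min(constants["hi"], value))


def run_shader(shader: Shader, buffer: Sequence[float],
               constants: Dict[str, float]) -> List[float]:
    """Apply the shader to every element independently and return a new buffer.

    On a GPU these invocations would run in parallel; here they run in a loop.
    """
    return [shader(v, constants) for v in buffer]


pixels = [0.25, 0.5, 0.75]
# Chain two stages: the output buffer of the first is the input of the second.
stage1 = run_shader(brighten, pixels, {"gain": 2.0})
stage2 = run_shader(clamp, stage1, {"lo": 0.0, "hi": 1.0})
print(stage2)  # [0.5, 1.0, 1.0]
```

Because no invocation reads another invocation's result within a stage, the order in which elements are processed does not matter, which is what lets the hardware run them side by side.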
GPU programming is a poor fit for handling HTTP requests, since that is mostly a complex sequential process, but it is very well suited to processing, for example, a photo or a neural network.
As soon as you can split your data into small independent units that can be processed in parallel, then yes, a GPU can help. The CPU will remain better for complex sequential tasks.
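As a rough illustration of what "chunking the data into parallel units" looks like, the sketch below splits a list into chunks, computes a partial result per chunk, and then combines the partials (a small map/reduce). It uses a thread pool from the Python standard library purely to show the structure; it is an assumption-laden stand-in, not GPU code, and in CPython threads will not actually speed up this kind of arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, chunk_size):
    """Split data into consecutive slices of at most chunk_size elements."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def partial_sum_of_squares(chunk):
    """Work on one chunk without touching any other chunk."""
    return sum(x * x for x in chunk)


def sum_of_squares(data, chunk_size=4, workers=4):
    """Map each chunk to a partial result, then reduce the partials."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunked(data, chunk_size)))
    return sum(partials)


values = list(range(10))
total = sum_of_squares(values)
print(total)  # 285
```

The per-chunk function never needs another chunk's data, which is the property that makes a workload a candidate for a GPU. A request handler that must parse, branch, and wait on I/O step by step has no such decomposition.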
Colonel Thirty Two links to a long and interesting answer about this if you want more information: https://superuser.com/questions/308771/why-are-we-still-using-cpus-instead-of-gpus

Related

Are there any computer viruses that affect gpus? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
Recent developments in GPUs (the past few generations) allow them to be programmed. Languages like CUDA, OpenCL, and OpenACC are specific to this hardware. In addition, certain games allow programming shaders, which take part in rendering images in the graphics pipeline. Just as code intended for a CPU can cause unintended execution that results in a vulnerability, I wonder if a game or other code intended for a GPU can result in a vulnerability.
The benefit a hacker would get from targeting the GPU is "free" computing power without having to deal with the energy cost. The only practical scenario here is crypto-miner viruses, see this article for example. I don't know the details of how they operate, but the idea is to use the GPU to mine cryptocurrencies in the background, since GPUs are much more efficient than CPUs at this. These viruses will cause substantial energy consumption if they go unnoticed.
As for an application running on the GPU causing or exploiting a vulnerability, the use cases are rather limited, since security-relevant data is usually not processed on GPUs.
At most you could deliberately crash the graphics driver and thereby prevent other programs from running properly.
There are already plenty of security mechanisms that prohibit reading other processes' VRAM and the like, but there is always some way around them.

How to write a GPU parallelization program that will run on any GPU? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have worked with Halide and CUDA. However, a technology like CUDA will only run on NVIDIA GPUs. OpenCL will also run on AMD cards, but there is no real all-in-one solution as far as I know.
But software such as MATLAB runs on any OS, independently of which GPU is installed. I believe MATLAB uses parallelization techniques to speed up calculations on matrices (or at least I hope so).
So how does one go about writing a piece of software that can use the GPU to parallelize calculations without writing separate software for each possible type of GPU? Or is this actually the only way to go?
I'm not planning to write such an application any time soon, I just became curious after taking a course on the subject.
You seem to be mistaken about MATLAB supporting any GPU: it uses CUDA, so it only supports NVIDIA GPUs.
See: https://www.mathworks.com/solutions/gpu-computing.html
and: https://www.mathworks.com/matlabcentral/answers/336084-will-matlab-support-amd-gpu-in-future
To answer your question, it seems like the two options are:
OpenCL: https://www.khronos.org/opencl/
DirectCompute (compute shaders): https://learn.microsoft.com/en-us/windows/win32/direct3d11/direct3d-11-advanced-stages-compute-shader
OpenCL is cross-platform, while DirectCompute is Windows-only and built on DirectX.

How is distributed memory parallelism handled in Rust? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
How is distributed memory parallelism handled in Rust? By that, I mean language constructs, libraries, or other features to handle computing on something like a cluster akin to what MPI provides C, but not necessarily using the same primitives or methodology. In the Rustonomicon, I see a discussion of threads and concurrency, but I don't see any discussion on parallelizing across multiple computers.
To the best of my knowledge, there isn't really anything built into the language for distributed computing (which is understandable, since that arguably isn't the language's major focus, or at least wasn't back in the day). I don't believe there is a particularly popular crate for distributed computing either. Actix is probably the only actor crate that has achieved any traction, and it supports HTTP, but I don't think it is targeted at HPC/supercomputer setups. You would also definitely want to check out Tokio, which seems to be pretty much the go-to library for asynchronous programming in Rust and is specifically targeted at network I/O.
At the present point in time, if you're looking to replicate MPI, my guess would be that your best bet is to use FFI to a C-based MPI library. It appears that there's been a handful of attempts to create bindings to MPI for Rust, but I'm not sure that any of them are particularly complete.

Parallel computing: from theory to practice [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I have studied how to optimize algorithms for multiprocessor systems. Now I would like to understand, in broad terms, how these algorithms can be turned into code.
I know that there are MPI-based libraries that help in developing software that is portable across different types of systems, but it is precisely the word "portable" that confuses me: how can the program adapt automatically to an arbitrary number of processors at runtime, given that this is an option of mpirun? How can the software decide on the proper topology (mesh, hypercube, tree, ring, etc.)? Can the programmer specify the preferred topology through MPI?
You start the application with a fixed number of cores. Thus, it cannot automatically adapt to an arbitrary number of processors at runtime.
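Although the process count is fixed at launch, the same program can still divide its work by whatever count it was started with. A common pattern is a block decomposition driven by the process rank and the total process count. The sketch below is plain Python so it runs without MPI: `block_range` is a hypothetical helper, and in a real MPI program `rank` and `size` would come from `MPI_Comm_rank` and `MPI_Comm_size` rather than being passed in by hand.

```python
def block_range(rank, size, n_items):
    """Return the half-open range [start, stop) of items owned by `rank`.

    The first `n_items % size` ranks get one extra item, so the load of
    any two ranks differs by at most one item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# The same function covers any process count chosen at launch time.
for size in (1, 3, 4):
    print(size, [block_range(rank, size, 10) for rank in range(size)])
# 1 [(0, 10)]
# 3 [(0, 4), (4, 7), (7, 10)]
# 4 [(0, 3), (3, 6), (6, 8), (8, 10)]
```

In that sense the code is portable across process counts: nothing is hard-coded, but the count is read once at startup and does not change while the job runs.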
You can tune your software to the topology of your cluster. This is really advanced and for sure not portable. It only makes sense if you have a fixed cluster and are striving for the last bit of performance.
Best regards, Georg

Measuring performances and scalability of mpi programs [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I want to measure the scalability and performance of an MPI program I wrote. So far I have used the MPI_Barrier function and the stopwatch library to measure the time. The problem is that the computation time depends a lot on the current load on my CPU and RAM, so I get different results every time. Moreover, my program runs in a VMware virtual machine, which I need in order to use Unix.
I wanted to ask: how can I get an objective measurement of the times? I want to see whether my program scales well or not.
In general, most people time their MPI programs with MPI_WTIME, since it is meant to be a portable way to read the wall-clock time. That will give you a decent real-time result.
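One way to tame run-to-run noise is to repeat the timed region several times and keep the smallest duration, then turn the timings for different process counts into speedup and efficiency figures. The sketch below is plain Python under stated assumptions: `time.perf_counter` stands in for MPI_WTIME, the helper names are made up, and the two timings passed to `speedup_and_efficiency` are invented numbers used only to show the arithmetic.

```python
import time


def best_wall_time(work, repeats=5):
    """Run `work` several times and return the smallest wall-clock duration in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        work()
        best = min(best, time.perf_counter() - start)
    return best


def speedup_and_efficiency(t_serial, t_parallel, n_procs):
    """Speedup S = T1 / Tp and parallel efficiency E = S / p."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_procs


elapsed = best_wall_time(lambda: sum(range(100_000)))
# Illustrative, made-up timings: 8.0 s on 1 process, 2.5 s on 4 processes.
s, e = speedup_and_efficiency(8.0, 2.5, 4)
print(f"best of 5: {elapsed:.6f} s, speedup={s:.2f}, efficiency={e:.2f}")
```

An efficiency close to 1 as the process count grows suggests good scalability. In an MPI program the timed region would typically be bracketed by a barrier so that all ranks start together.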
If you're looking to measure CPU time instead of real time, that's a very different and much more difficult problem. Usually the way most people handle that is to run their benchmarks on an otherwise quiet system.
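The difference between the two kinds of time can be seen with a small sketch. It uses Python's standard `time.perf_counter` (wall-clock) and `time.process_time` (CPU time of the current process) rather than any MPI call, and `measure` is a hypothetical helper. A sleeping process accumulates wall-clock time but almost no CPU time.

```python
import time


def measure(work):
    """Return (wall_seconds, cpu_seconds) spent while running `work` in this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


# Sleeping: wall-clock time passes, but the process uses almost no CPU.
wall_sleep, cpu_sleep = measure(lambda: time.sleep(0.05))
# Busy loop: CPU time and wall-clock time are usually close on a quiet machine.
wall_busy, cpu_busy = measure(lambda: sum(i * i for i in range(200_000)))

print(f"sleep: wall={wall_sleep:.3f} s cpu={cpu_sleep:.3f} s")
print(f"busy:  wall={wall_busy:.3f} s cpu={cpu_busy:.3f} s")
```

On a loaded or virtualized machine the wall-clock figure for the busy loop can grow while its CPU time stays roughly the same, which is why benchmarks are usually run on an otherwise quiet system.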
