Does OpenCL release all device memory after process termination? - caching

On Linux I have always taken it for granted that whatever resources a process allocates are released when the process terminates. Memory is freed, open file descriptors are closed. No memory is leaked when I start and terminate a process several times in a loop.
Recently I've started working with OpenCL.
I understand that the OpenCL compiler keeps compiled kernels in a cache, so when I run a program that uses the same kernels as a previous run (or probably even the same kernels as another process), they don't need to be compiled again. I guess that cache is on the device.
From that behaviour I suspect that allocated device memory might be cached as well (maybe associated with a magic cookie for later reuse, or something like that) if it was not released explicitly before termination.
So I pose this question to rule out any such suspicion:
kernels survive in cache => other memory allocations survive somehow???
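
For concreteness, here is a minimal host-side sketch of what I mean by "released explicitly" (single GPU device assumed, error handling omitted):

    // Allocate a device buffer and release it explicitly before exit.
    #include <CL/cl.h>

    int main()
    {
        cl_platform_id platform;
        cl_device_id device;
        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

        cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, 1 << 20, NULL, NULL);

        // ... use the buffer ...

        clReleaseMemObject(buf);   // the explicit device-memory release
        clReleaseContext(ctx);     // releasing the context frees what remains
        return 0;
    }

The question is what happens to that buffer if the process dies before the two release calls run.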

My short answer would be yes, based on observations with this tool: http://www.techpowerup.com/gpuz/
I'm investigating a memory leak on my device, and I noticed that memory is freed when my process terminates... most of the time. If you have a memory leak like me, it may linger even after the process has finished.
Another tool that may help is http://www.gremedy.com/download.php
but it's really buggy, so use it judiciously.

Related

When does the OOM killer get invoked to kill a process in Linux kernel version 5.x.x?

Is it possible for a Linux kernel OOM killer to kill a process even if there is enough swap space available?
This link in section 13.2 suggests that if swap space is available, the OOM killer will not kill a process.
Now, I made changes to the Linux kernel and stopped the swapping of anonymous pages entirely, so there is always free swap space available. But I still observe the OOM killer killing processes.
However, that documentation may be outdated. Can someone provide insight into what checks are performed before the OOM killer actually kills a process in the latest Linux kernel (v5.x.x)?
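For context, this is roughly the kind of test program I use to provoke the OOM killer (the 64 MiB chunk size is arbitrary):

    // Allocate and dirty anonymous memory until the kernel intervenes.
    // Touching each page matters: under overcommit, untouched
    // allocations are not backed by physical pages.
    #include <cstddef>
    #include <cstdio>
    #include <cstring>
    #include <new>

    int main()
    {
        const std::size_t chunk = 64 * 1024 * 1024;   // 64 MiB per step
        for (;;) {
            char *p = new (std::nothrow) char[chunk];
            if (!p) break;             // under strict accounting, allocation fails instead
            std::memset(p, 1, chunk);  // dirty the pages so they count
            std::puts("allocated another chunk");
        }
        return 0;
    }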

Effects of not freeing memory in a Windows application?

I have an MFC C++ application that usually runs constantly in the system tray.
It allocates a very extensive tree of objects in memory, which causes the application to take several seconds to free them when it needs to shut down.
All my objects are allocated using new and typically freed using delete.
If I just skip deleting all the objects in order to quit faster, what are the effects, if any?
Does Windows realize the process is dead and reclaim the memory automatically?
I know that not freeing allocated memory is almost sacrilegious, but I thought I would ask to see what everyone else thinks.
The application only shuts down when either the user's system shuts down or they choose to shut the program down themselves.
When a process terminates, the system reclaims all of its resources. This includes releasing open handles to kernel objects and freeing allocated memory. Not freeing memory before process termination has no adverse effect on the operating system.
You will find substantial information about the steps performed during process termination at Terminating a Process. With respect to your question, the following is the relevant section:
Terminating a process has the following results:
...
Any resources allocated by the process are freed.
You probably should not skip the cleanup step in your debug builds though. Otherwise you will not get memory leak diagnostics for real memory leaks.
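As a sketch of that pattern (the type and function names here are made up for illustration), you can gate the expensive teardown on the build configuration:

    // Run the slow recursive delete only in debug builds, where the CRT
    // leak detector needs balanced new/delete. In release builds the tree
    // is intentionally leaked; Windows reclaims all of the process's
    // private memory at termination anyway.
    struct Node {
        Node *left = nullptr, *right = nullptr;
        ~Node() { delete left; delete right; }   // recursive teardown
    };

    Node *g_root = nullptr;   // stands in for the application's object tree

    void ShutdownFast()
    {
    #ifdef _DEBUG
        delete g_root;        // slow, but keeps leak diagnostics meaningful
        g_root = nullptr;
    #endif
        // Release builds fall through and let process termination clean up.
    }

    int main()
    {
        g_root = new Node;    // ...imagine a very large tree here...
        ShutdownFast();
        return 0;
    }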

What happens to shared memory if one of the process sharing the memory is killed?

I was working on shared memory and this question came to mind, so I thought I'd ask the experts:
What happens to the shared memory if one of the processes sharing it is killed? What happens if we do a hard kill rather than a normal kill?
Is it dependent on the mechanism we use for shared memory?
If it matters, I am working on Windows.
Provided at least one other thread in another process has an open handle to the file mapping, I would expect the shared memory to remain intact.
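A sketch of what I mean, using a pagefile-backed named section (the section name is made up for illustration):

    // The kernel keeps the section object alive while ANY process holds a
    // handle or a mapped view of it, so killing one process does not tear
    // the shared memory down as long as another process still has it open.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        HANDLE h = CreateFileMappingW(
            INVALID_HANDLE_VALUE,          // pagefile-backed, no real file
            nullptr, PAGE_READWRITE,
            0, 4096,                       // 4 KiB section
            L"Local\\DemoSharedSection");  // illustrative name
        if (!h) return 1;

        void *view = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, 0);
        if (view) {
            std::printf("mapped at %p\n", view);
            // A second process can OpenFileMappingW the same name here.
            UnmapViewOfFile(view);
        }
        CloseHandle(h);   // last handle + last view gone => section destroyed
        return 0;
    }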

CUDA/PyCUDA: Diagnosing launch failure that disappears under cuda-gdb

Anyone know likely avenues of investigation for kernel launch failures that disappear when run under cuda-gdb? Memory assignments are within spec, launches fail on the same run of the same kernel every time, and (so far) it hasn't failed within the debugger.
Oh Great SO Gurus, What now?
cuda-gdb spills all shared memory and registers to local memory. So when something runs OK when built for debugging but fails otherwise, it usually means an out-of-bounds shared memory access. cuda-memcheck might help, depending on what sort of card you are using. Fermi is better than older cards in that respect.
EDIT:
Casting my mind back to the bad old days, I remember having an ornery GT9500 which used to throw similar NV13 errors and have random code failures when running very memory intensive kernels with a lot of shared memory activity. Never when debugging. I put it down to bad hardware and moved on to a GT200, never to see a similar error since. One possibility might be bad hardware. Is this a G92 (9800GT or similar)?
cuda-gdb can make some CUDA operations synchronous.
Are you reading from memory before it has been initialized?
Are you using streams?
Are you launching more than one kernel?
Where and how does it fail?
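One more suggestion: before reaching for cuda-memcheck, it helps to separate launch-time errors from asynchronous execution errors on the host side. A minimal sketch (the trivial kernel is just a placeholder):

    // Distinguish launch-configuration errors (reported immediately) from
    // errors that only surface once the kernel has actually executed.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void kernel(int *out) { out[threadIdx.x] = threadIdx.x; }

    int main()
    {
        int *d_out = nullptr;
        cudaMalloc(&d_out, 256 * sizeof(int));

        kernel<<<1, 256>>>(d_out);

        cudaError_t launchErr = cudaGetLastError();      // bad config, etc.
        cudaError_t execErr = cudaDeviceSynchronize();   // out-of-bounds, etc.

        std::printf("launch: %s, exec: %s\n",
                    cudaGetErrorString(launchErr),
                    cudaGetErrorString(execErr));
        cudaFree(d_out);
        return 0;
    }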

How can I use up RAM quickly to test garbage collection?

Windows Server 2008. How can I quickly use up RAM so as to induce GC in my app? If there is a way to do it without needing Visual Studio or installing a language runtime, that would be good.
EDIT: I don't want to have to write an app and then copy it over to the server. I'm looking for a way to do it quickly without writing an app that requires an IDE or the installation of a runtime/compiler.
Perhaps a PowerShell or batch script?...
I don't think using up RAM outside your process will necessarily trigger GC.
If I understand your question correctly, you have a program Foo.exe written in some unknown language, running on some unknown runtime (are you not allowed to post the details for some reason, or do you just not know?), and you want to get that program's runtime to trigger a garbage collection by using up RAM outside of Foo.exe.
You could do this by creating a simple batch file that just starts up a hundred copies of IE or Word or whatever program you want. However, I don't think that will do what you want it to do. If your process has already allocated a certain amount of memory, it won't necessarily give that memory up or trigger a GC just because other processes are being started. It may page to disk, or it may force other programs to page to disk. But not all garbage collectors are alike, so we can't really help without more details. I'm pretty sure some VMs never give back memory once they've allocated it, even after a GC.
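That said, if compiling one tiny throwaway program is acceptable despite the no-IDE constraint, a memory hog is only a few lines; just keep in mind the caveat above that pressuring the OS may not force your app's GC. A sketch (the MiB count comes from the command line):

    // Hold the requested number of MiB of dirtied memory until Enter is
    // pressed, to create memory pressure on the machine.
    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <new>
    #include <vector>

    int main(int argc, char **argv)
    {
        std::size_t mib = (argc > 1) ? std::strtoul(argv[1], nullptr, 10) : 1024;
        std::vector<char *> blocks;
        for (std::size_t i = 0; i < mib; ++i) {
            char *p = new (std::nothrow) char[1024 * 1024];
            if (!p) break;
            std::memset(p, 1, 1024 * 1024);   // commit the pages
            blocks.push_back(p);
        }
        std::printf("holding %zu MiB, press Enter to release\n", blocks.size());
        std::getchar();
        for (char *p : blocks) delete[] p;
        return 0;
    }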
You could run your program inside a virtual machine such as VirtualBox, where you can specify the memory ceiling of the guest operating system.
I'm having trouble imagining a scenario where this would be necessary, though. Could you provide more information about the problem?
If you are using Java, you can specify the maximum heap size using -Xmx. Search for JVM memory settings.
