OpenGL "out of memory" on glReadPixels() - windows

I am running into an "out of memory" error from OpenGL on glReadPixels() under low-memory conditions. I am writing a plug-in to a program that has a robust heap mechanism for such situations, but I have no idea whether or how OpenGL could be made to use it for application memory management. The notion that this is even possible came to my attention through this [albeit dated] thread on a similar issue under Mac OS [not X]: http://lists.apple.com/archives/Mac-opengl/2001/Sep/msg00042.html
I am using Windows XP, and have seen it on multiple NVidia cards. I am also interested in any work-arounds I might be able to relay to users (the thread mentions "increasing virtual memory").
Thanks, Sean

I'm quite sure that the out-of-memory error is not raised from glReadPixels (in fact, glReadPixels doesn't allocate memory itself).
The error is probably raised by other routines allocating buffer objects or textures. Once you detect the out-of-memory error, you should release all non-mandatory objects (textures, texture mipmaps, rarely used buffer objects) in order to allocate a new buffer holding the data returned by glReadPixels.
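A minimal sketch of that recovery path, assuming a plain OpenGL 1.1 context on Windows; the scratch-texture ids passed in are placeholders for whatever non-essential objects your plug-in actually owns:

#include <windows.h>
#include <GL/gl.h>

/* Read back the framebuffer; on GL_OUT_OF_MEMORY, release expendable
   textures and retry once. */
GLenum ReadBackWithRetry(int w, int h, void *pixels,
                         const GLuint *scratchTextures, GLsizei count)
{
    glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
    GLenum err = glGetError();

    if (err == GL_OUT_OF_MEMORY) {
        /* Release whatever the plug-in can live without, then try again. */
        glDeleteTextures(count, scratchTextures);
        glFinish();  /* give the driver a chance to actually reclaim the storage */
        glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
        err = glGetError();
    }
    return err;
}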

Without more specifics, it's hard to say. Ultimately OpenGL is going to talk to the native OS when it needs to allocate, so if nothing else you can always replace (or hook) the default CRT/heap allocator for your process and have it fetch blocks from the "more robust" heap manager in the plug-in host.
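If you go the hooking route, here is a minimal sketch of what that could look like using Microsoft Detours (my choice of mechanism, not something specified above); HostHeapAlloc/HostHeapFree are hypothetical stand-ins for whatever API the plug-in host's heap manager actually exposes:

#include <windows.h>
#include <detours.h>
#include <stdlib.h>

/* Hypothetical routines exported by the plug-in host's heap manager. */
void *HostHeapAlloc(size_t size);
void  HostHeapFree(void *ptr);

static void *(__cdecl *Real_malloc)(size_t) = malloc;
static void  (__cdecl *Real_free)(void *)   = free;

static void *__cdecl Hooked_malloc(size_t size) { return HostHeapAlloc(size); }
static void  __cdecl Hooked_free(void *ptr)     { HostHeapFree(ptr); }

/* Redirect the CRT allocator for this process to the host's heap. */
void InstallHeapHooks(void)
{
    DetourTransactionBegin();
    DetourUpdateThread(GetCurrentThread());
    DetourAttach((PVOID *)&Real_malloc, (PVOID)Hooked_malloc);
    DetourAttach((PVOID *)&Real_free, (PVOID)Hooked_free);
    DetourTransactionCommit();
}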

Related

Is Xcode's debug navigator useless?

I am building an app in Xcode and am now deep into the memory management portion of the project. When I use Allocations and Leaks I seem to get entirely different results from what I see in Xcode's debug panel: in particular, the debug panel seems to show much higher memory usage than what I see in Allocations, and it also seems to highlight leaks that, as far as I can tell, (1) do not exist and (2) are confirmed not to exist by the Leaks tool. Is this thing useless, or, even worse, misleading?
Here was a new one: today it told me I was using >1 GB of memory but its little memory meter read significantly <1 GB (and was still wrong if the Allocations data is accurate). Picture below.
UPDATE: I ran VM Tracker in a 38-minute session and it does appear virtual memory accounts for the difference between allocations / leaks and the memory gauge. Picture below. I'm not entirely sure how to think about this yet. Our game uses a very large number of textures that are swapped. I imagine this is common in most games of our scale (11 boards, 330 levels; each board and map screen has unique artwork).
You are probably using the Memory Gauge while running in the Simulator using a Debug build configuration. Both of those will give you misleading memory results. The only reliable way to know how memory is being managed is to run on a device using a Release build. Instruments uses the Release build configuration, so it's already going to be better than just running and using the Memory Gauge.
Moreover, it is a known flaw that the Xcode built-in memory tools, such as the Memory Debugger, can generate false positives for leaks.
However, Instruments has its flaws as well. My experience is that, for example, it fails to catch leaks generated during app startup. Another problem is that people don't always understand how to read its output. For example, you say:
the debug panel seems to show much higher memory usage than what I see in Allocations
Yes, but Allocations is not the whole story. You are probably failing to look at the VM allocations. Those are shown separately and often constitute the reason for high memory use (because they include the backing stores for images and the view rendering tree). The Memory Gauge does include virtual memory, so this alone might account for the "difference" you think you're seeing.
So, the answer to your question is: No, the Memory Gauge is not useless. It gives a pretty good idea of when you might need to be alert to a memory issue. But you are then expected to switch to Instruments for a proper analysis.

What are the common causes of Prefetch Abort errors on ARM-based devices?

My C program runs on a bare-metal Raspberry Pi 3B+. It works fine, except that I get random freezes that are reported as a Prefetch Abort by the CPU itself. The device may work fine for hours and then suddenly crash. It does nothing special before it crashes, so the failure is not predictable.
The Fault Status Register (FSR) is set to 0xD when this error happens, which indicates a permission fault: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0087e/Cihhhged.html
Other registers: FAR is 0xE80000B6, LR is 0xFFFFFFFF, PC is 0xE80000B6, PSR is 0x200001F1.
My program uses FIQ and IRQ interrupts, and it uses all four CPU cores.
I'm not asking for specific debugging help here, since it would be too complicated to dive into the details, but are you aware of common causes of Prefetch Aborts?
Given that your code is multi-threaded (multi-core, indeed) and the crash is not predictable, I'd say that the prefetch abort is almost certainly being caused by memory corruption due to a race.
It might help if you can find out where the abort is being generated. Bugs like this can be extremely hard to track down though; if the code always crashes in the same place then that could help, but even if it does and you can find out which address is suffering corruption, monitoring that address for rogue writes without affecting the timing of the program (and hence the manifestation of the bug) is essentially impossible.
It is quite likely that the root cause is a buffer overrun, especially given your comments above. Really you should know in advance how big your buffers will need to be, and then make them that size. If whatever algorithm you're using can't guarantee a limit on the amount of buffer it uses, you should add code that performs a runtime check on the buffer and responds appropriately (perhaps a nicely reported error so you know which buffer is overflowing). Using the heap is OK, but declaring a large buffer as static is faster and leak-free, provided the function containing the buffer is non-reentrant.
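The runtime check itself can be as simple as refusing any write that would run past the end of the buffer and reporting which buffer it was; a sketch with illustrative names:

#include <stddef.h>
#include <string.h>

#define RX_BUF_SIZE 1024
static unsigned char rx_buf[RX_BUF_SIZE];  /* static: no heap, no leak */
static size_t rx_len;

/* Returns 0 on success, -1 if the write would overflow rx_buf. */
int rx_buf_append(const unsigned char *data, size_t n)
{
    if (n > RX_BUF_SIZE - rx_len) {
        /* Report the overflowing buffer here (UART trace, LED pattern, ...). */
        return -1;
    }
    memcpy(rx_buf + rx_len, data, n);
    rx_len += n;
    return 0;
}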
If there are data access races in the mix too, note that you will need more than data barrier instructions to solve these. The data barrier instructions only address consistency problems related to pending memory transactions. They don't prevent register caching of shared data (you need the volatile keyword for that) or simultaneous read-modify-write races (you need mutual exclusion mechanisms for that, either as provided by whatever framework you're using or home-brewed using the STREX and LDREX instructions on armv7).
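For the home-brewed option, here is a minimal spin lock along those lines, assuming GCC-style inline assembly on ARMv7 bare metal with no OS primitives available; lock_var is just an illustrative name:

static volatile unsigned int lock_var = 0;  /* 0 = free, 1 = held */

static void spin_lock(volatile unsigned int *lock)
{
    unsigned int tmp;
    __asm__ volatile(
        "1:  ldrex   %0, [%1]      \n"   /* read the lock, mark it exclusive    */
        "    cmp     %0, #0        \n"   /* already held?                       */
        "    bne     1b            \n"   /* spin until it looks free            */
        "    strex   %0, %2, [%1]  \n"   /* try to claim it; %0 = 0 on success  */
        "    cmp     %0, #0        \n"
        "    bne     1b            \n"   /* lost the race, retry                */
        "    dmb                   \n"   /* barrier before the critical section */
        : "=&r"(tmp)
        : "r"(lock), "r"(1)
        : "cc", "memory");
}

static void spin_unlock(volatile unsigned int *lock)
{
    __asm__ volatile("dmb" ::: "memory");  /* publish writes made under the lock */
    *lock = 0;                             /* ordinary store releases the lock   */
}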

When does an operating system wipe out the memory of a process?

When a process terminates, successfully or abnormally, when does the OS decide to wipe out the memory (data, code, etc.) allocated to that process: at exit, or when it wants to allocate that memory to a new process?
And is this procedure for wiping out memory the same on all operating systems (Windows XP, Windows 7, Linux, Mac)?
I understand that the page table maps that process's virtual addresses to actual physical addresses in memory.
Thanks.
How an OS reclaims process resources can (and generally does) vary by OS. On the Windows side of things, the NT-derived OSes behave similarly, so there should be little difference between Windows XP and Windows 7. Note that asking about 'memory' is an over-simplification in this context, because there are different types of memory. For example, a typical Windows application will have stack memory, heap memory (sometimes multiple heaps), instruction/static memory, and perhaps shared memory. Most of this memory is solely owned by the process, and Windows will reclaim it on process termination (even abnormal termination).
However, shared memory can (and often does) have multiple owners; it is tied to a Windows handle (a kernel-level object that can potentially be referenced from multiple processes). Handles have a reference count, and the associated resource is reclaimed if the reference count goes to zero. This means that shared memory can outlive a process that references it. Also, it is possible for a process to 'leak' a handle, and for the handle to never be reclaimed. It is the programmer's responsibility to make sure such handles are properly closed and don't leak; the possibility of abnormal termination complicates this responsibility.
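As a sketch of that reference-counting behaviour, consider a named file-mapping object (the name below is made up for illustration): another process's handle keeps the shared memory alive even if the creator exits or crashes, and only when every handle is closed does the kernel reclaim it.

#include <windows.h>

/* Creator: a pagefile-backed shared block, alive while any handle is open. */
HANDLE CreateSharedBlock(void)
{
    return CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                              0, 4096, L"Local\\MySharedBlock");
}

/* Another process: opening by name adds a reference to the same kernel object. */
HANDLE OpenSharedBlock(void)
{
    return OpenFileMappingW(FILE_MAP_ALL_ACCESS, FALSE, L"Local\\MySharedBlock");
}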
On a side note, when Windows 'reclaims' memory, it simply means that memory is available for future allocations to other processes, etc. The actual 1s and 0s are usually going to sit there until the OS allocates the memory and the new owner actively overwrites it. So 'reclaiming' doesn't mean the memory is immediately zeroed out or anything like that; scrubbing the memory in this manner is inefficient and often unnecessary. If you are asking out of security concerns, you shouldn't rely on the OS; you'll need to scrub the memory yourself before your process releases it back to the OS.
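If the contents are sensitive, a small sketch of doing that scrubbing yourself (the buffer and where it came from are illustrative):

#include <windows.h>

void ReleaseSecret(void *secret, SIZE_T size)
{
    /* SecureZeroMemory is guaranteed not to be optimised away, unlike memset. */
    SecureZeroMemory(secret, size);
    HeapFree(GetProcessHeap(), 0, secret);  /* assumes it came from the process heap */
}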
If you want to know more about how modern Windows OSes handle memory, and don't mind doing some digging, the Windows API documentation on MSDN has a lot of information on the subject, but it is a little bit scattered. Good places to start would probably be Windows Handles, and load/unload library/process calls. Application Programming for Windows (Richter) probably has some decent information on this, if I remember right, but I don't have a copy on hand right now to check.
Hopefully, someone more knowledgeable about Linux internals can address that side of the question. This is OS-specific stuff, so there are likely to be differences. It might be worth noting that pre-NT Windows (e.g. Windows 95, 98, etc.) had a completely different model of process memory. Those differences tended to make it harder for the OS to reclaim memory in the case of abnormal termination; some users found they needed to restart the OS frequently when running unstable applications, in order to clean up the accumulated memory leaks.
In Linux the resources are normally freed upon termination of the process. You can read about how Linux handles process termination here: http://www.informit.com/articles/article.aspx?p=370047&seqNum=4
There is also the OOM killer, which can kick in in extreme low-memory situations. I know that in the embedded Android world there has been work around this, but I haven't really kept on top of it; LWN.net probably has some coverage.

How does the size of managed code affect memory footprint?

I have been tasked with reducing the memory footprint of a Windows CE 5.0 application. I came across Rob Tiffany's highly cited article, which recommends using a managed DLL to keep the code out of the process's slot. But there is something I don't understand.
The article says that
The JIT compiler is running in your slot and it pulls in IL from the 1 GB space as needed to compile the current call stack.
This means that all the code in the managed DLL can potentially end up in the process's slot eventually. While this will help other processes by not loading the code into the common area, how does it help this process? FWIW, the article does mention that
It also reduces the amount of memory that has to be allocated inside your […]
My only thought is that just as the code is pulled into the slot it is also pushed/swapped out. But that is just a wild guess and probably completely false.
CF assemblies aren't loaded into the process slot like native DLLs are. They're actually accessed as memory-mapped files. This means that the size of the DLL is effectively irrelevant.
The managed heap also lies in shared memory, not your process slot, so object allocations are far less likely to cause process slot fragmentation or OOM's.
The JITter also doesn't just JIT and hold the result forever. It compiles what is necessary, and during a GC it may very well pitch compiled code that is not being used, or that hasn't been used in a while. You're never going to see an entire assembly JITted and pulled into the process slot (well, if it's a small assembly maybe, but it's certainly not typical).
Obviously some process slot memory has to be used to create some pointers, stack storage, etc., but by and large managed code has far less impact on the process slot limitations than native code. Of course you can still hit the limit with large stacks, P/Invokes, native allocations and the like.
In my experience, the area where people most often get into trouble with CF apps and memory is GDI objects and drawing. Bitmaps take up a lot of memory. Even though it's largely in shared memory, creating lots of them (along with brushes, pens, etc.) and not caching and reusing them is what most often gives a managed app a large memory footprint.
For a bit more detail, this MSDN webcast on Compact Framework memory management, while old, is still very relevant.

Troubleshooting ERROR_NOT_ENOUGH_MEMORY

Our application is failing on one specific user's computer with ERROR_NOT_ENOUGH_MEMORY ("Not enough storage is available to process this command").
The error is apparently being raised somewhere deep within the Delphi VCL framework that we're using, so I'm not sure which Windows API function is responsible.
Is memory a problem? A call to GlobalMemoryStatus gives the following information:
dwTotalPhys - 1063150000 (~1 GB)
dwAvailPhys - 26735000 (~27 MB)
dwAvailPageFile - 1489000000 (~1.4 GB)
It seems strange to me that Windows would let the available physical memory get so low when so much space is available in the paging file, but I don't know enough about Windows' virtual memory management to know if this is normal or not. Is it?
If not memory, then which resource limit is being hit? From what I read online, ERROR_NOT_ENOUGH_MEMORY could be the result of the application hitting any of several limits (GDI objects, USER objects, handles, etc.) and not necessarily memory. Is there a comprehensive list of what limits Windows enforces? Is there any way to find out which limit is being hit? I tried Google, but I couldn't find any systematic overview.
A more common cause of this error than any of those you've listed is fragmentation of the virtual address space. This is a situation where, although the total free memory is quite reasonable, the free space is fragmented, with various bits of the virtual address space currently allocated. Hence you can get an out-of-memory error when a request cannot be satisfied by a single contiguous block, despite there being enough free memory in total.
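One way to confirm that diagnosis, as a sketch: walk the process's address space with VirtualQuery and report the largest contiguous free region; if it is small even though total free memory looks healthy, fragmentation is the likely culprit.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    MEMORY_BASIC_INFORMATION mbi;
    SIZE_T largestFree = 0;
    unsigned char *p = NULL;

    /* Walk every region from the bottom of the address space upward. */
    while (VirtualQuery(p, &mbi, sizeof(mbi)) == sizeof(mbi)) {
        if (mbi.State == MEM_FREE && mbi.RegionSize > largestFree)
            largestFree = mbi.RegionSize;
        p = (unsigned char *)mbi.BaseAddress + mbi.RegionSize;
    }

    printf("Largest contiguous free block: %llu bytes\n",
           (unsigned long long)largestFree);
    return 0;
}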
Check all the possibilities.
GDI problems can be monitored using the free GDIView utility. It's a single file that users can start without an installer.
Also, install Process Explorer on the machine concerned.
If you have no access to the machine, ask the user to make screenshots of the status reported by these tools. Very likely, this will give you some hint.
The culprit in this case was CreateCompatibleBitmap. Apparently Windows may enforce fairly strict systemwide limits on the memory available for device-dependent bitmaps (see, e.g., this mailing list discussion), even if your system otherwise has plenty of memory and plenty of GDI resources. (These systemwide limits apparently exist because Windows may allocate device-dependent bitmaps in the video card's memory.)
The solution is simply to use device-independent bitmaps (DIBs) instead (although these may not perform quite as well). This KB article describes how to pick the optimal DIB format for a device.
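A sketch of the swap, assuming a 32-bpp format is acceptable for your drawing; CreateDIBSection allocates the pixel storage in the process's own address space rather than the systemwide device-dependent bitmap pool:

#include <windows.h>

/* Drop-in style replacement for CreateCompatibleBitmap(hdc, width, height). */
HBITMAP CreateDibBitmap(HDC hdc, int width, int height, void **bits)
{
    BITMAPINFO bmi;
    ZeroMemory(&bmi, sizeof(bmi));
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;   /* negative height = top-down rows */
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    return CreateDIBSection(hdc, &bmi, DIB_RGB_COLORS, bits, NULL, 0);
}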
Other candidates for resource limits (from others' answers and my own research):
GDI resources (from this answer) - easily checked with GDIView
Virtual memory fragmentation (from this answer)
Desktop heap - see here or here
My answer may be a little late, but it comes from my own experience with that same issue, after doing all the tests and going step by step: creating a DC and releasing it, using a DIBSection instead of a CompatibleBitmap, using GDI/memory leak tools, etc.
In the end (LOL) I found that I had the order of these two calls reversed; swapping them fixed the whole problem:
DeleteDC(hdc); //do it first (always before deleting objects)
DeleteObject(obj);
