I'm thinking about upgrading to Go 1.4 but am concerned because I no longer know how to change the max amount of memory I can address.
I have been using Go to run machine learning experiments on a large server with 512GB of main memory, which makes the 128GB limit imposed by the 37-bit address space insufficient.
Previously I would edit malloc.h in the runtime package to switch to 38-bit addresses, but with the runtime's conversion from C to Go I'm having difficulty finding whether there is still something as simple to modify.
This commit, which raised the maximum from 16GB to 128GB, shows the kind of change I am talking about: https://code.google.com/p/go/source/detail?r=a310cb32c278
So I realized I did not find the file because I am not used to the Google Code repo explorer. I located what are now three malloc*.go files and have found the relevant section of code.
https://code.google.com/p/go/source/browse/src/runtime/malloc2.go#122
Honestly, I think the updated code, which selects values by multiplying by 0-or-1 boolean constants rather than using simple if statements, is overly confusing and doesn't convey what is going on as clearly as the old header file did.
Also, thank you bamboon; I only realized my mistake after reading the mailing list and being linked to a different repo viewer.
My program detects a memory leak when it closes, and I'd like to resolve it. I've replaced my only usage of new with a smart pointer, but I'm using quite a few bare char[] buffers and can't really afford the time to go through every pointer in the project.
Can I identify the cause of a memory leak from this heap snapshot?
Is there a way to download this file and search through it for that particular memory location?
Am I wasting my time trying to diagnose this issue because Windows will clean up the memory anyways?
Can I identify the cause of a memory leak from this heap snapshot?
Typically not. You need at least two snapshots so you can compare them and find the difference. You would then walk through the code that ran between those two snapshots and see where the difference comes from.
This requires a lot of discipline and closely following the steps of memory analysis:
Perform the steps in question once. This will ensure that any lazy initialization is done.
Take a snapshot
Perform the steps in question again. Make sure you return to the same point as before.
Take a snapshot
Compare the snapshots
The result may still include false positives, e.g. output accumulated in the log window (20 lines before the steps, 40 lines after).
Is there a way to download this file and search through it for that particular memory location?
I don't know whether that's built into Visual Studio. But you could, of course, create a crash dump file with full memory and then look the address up there.
Am I wasting my time trying to diagnose this issue because Windows will clean up the memory anyways?
IMHO you're not wasting your time. An application that leaks memory may run out of address space, crash, and cause data loss for the user. You'll find the problem and become a better programmer, and you'll learn something that helps you with other programming languages as well.
But other than that, you're right: Windows will free the memory when the process terminates. So restarting the program will likely help, and that may serve customers as a workaround for a few months.
This is a simple program that decompresses a string. I'm just running a loop to show that memory usage increases and that the memory used is never released.
The memory is not released even after 8 hours.
Package for decompressing string: https://github.com/Albinzr/lzGo - (simple lz string algorithm)
I'm adding a gist link since the string used for decompressing is large
Source Code: (gist link)
Activity Monitor: (screenshot)
I'm completely new to Go. Can anyone tell me how I can solve this memory issue?
UPDATE (Jul 15, 2020)
The app still crashes when the memory limit is reached. Since it only uses 12MB-15MB, this should not happen!
There is a lot going on here.
First, using Go version 1.14.2 your program works fine for me. It does not appear to be leaking memory.
Second, even when I purposely created a memory leak by increasing the loop size to 100 and saving the results in an array, I only used about 100 MB of memory.
Which brings us to the third point: you should not use Activity Monitor, or any other operating-system-level tool, to check for memory leaks in a Go program. Operating system memory management is a painfully complex topic, and the OS tools are designed to help you determine how a program affects the whole system, not what is going on within the program.
Specifically, macOS "Real Memory" (analogous to RSS, Resident Set Size) includes memory the program is no longer using but the OS has not taken back yet. When the garbage collector frees up memory and tells the OS it does not need that memory anymore, the OS does not immediately take it back. (Why it works that way is way beyond the scope of this answer.)
Also, if the OS is under memory pressure, it can take back not only memory the program has freed, but also (temporarily) memory the program is still using but has not accessed "recently", so that another program that urgently needs memory can use it. In this case, "Real Memory" will be reduced even though the process is not actually using less memory. There is no statistic reported by the operating system that will help you here.
You need to use native Go settings like GODEBUG=gctrace=1 or tools like expvar and expvarmon to see what the garbage collector is doing.
As for why your program ran out of memory when you limited it, keep in mind that a Go executable can be dynamically linked when cgo is in use (as it is by default), and just reading in the shared libraries can take up a lot of memory. Try building your application with static linking by setting CGO_ENABLED=0 and see if that helps. Also see how much memory it uses when you run only one iteration of the loop.
I see many articles suggesting that huge files should not be memory-mapped, so that the virtual address space isn't consumed entirely by the mapping.
How does that change with 64 bit process where the address space dramatically increases?
If I need to randomly access a file of dozens of GBs, is there a reason not to map the whole file at once?
On 64-bit, go ahead and map the file.
One thing to consider, based on Linux experience: if the access is truly random and the file is much bigger than you can expect to cache in RAM (so the chances of hitting a page again are slim), then it can be worth specifying MADV_RANDOM to madvise, to stop file pages that happen to be hit from steadily accumulating and pointlessly swapping other, actually useful things out. I have no idea what the Windows equivalent API is, though.
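For illustration, a minimal sketch of the Linux side (the file name huge.dat is made up, and error handling is kept terse):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("huge.dat", O_RDONLY);  /* illustrative file name */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the entire file at once; on 64-bit the address space is plentiful. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Hint that accesses are random: the kernel skips read-ahead and
       does not favor keeping this file's pages cached at others' expense. */
    if (madvise(p, st.st_size, MADV_RANDOM) < 0)
        perror("madvise");

    /* ... random reads via p[offset] ... */

    munmap(p, st.st_size);
    close(fd);
    return 0;
}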
There is a reason to think carefully about using memory-mapped files, even on a 64-bit platform (where the size of the virtual address space is not an issue). It's related to the (potential) error handling.
When reading a file "conventionally", any I/O error is reported by the appropriate function's return value. The rest of the error handling is up to you.
OTOH, if the error arises during the implicit I/O (resulting from a page fault and the attempt to load the needed file portion into the appropriate memory page), the error-handling mechanism depends on the OS.
On Windows the error handling is performed via SEH, the so-called "structured exception handling". The exception propagates to user mode (the application's code), where you have a chance to handle it properly. Proper handling requires you to compile with the appropriate exception-handling settings in the compiler (to guarantee the invocation of destructors, where applicable).
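For example, on Windows with MSVC it looks roughly like this sketch (the helper name read_mapped_byte is made up; EXCEPTION_IN_PAGE_ERROR is the exception code raised when the backing read fails during a page-in):

#include <windows.h>
#include <stdio.h>

/* Read one byte from a mapped view, catching in-page I/O errors. */
static int read_mapped_byte(const char *view, size_t offset, char *out)
{
    __try {
        *out = view[offset];   /* may fault the page in from disk */
        return 1;
    }
    __except (GetExceptionCode() == EXCEPTION_IN_PAGE_ERROR
                  ? EXCEPTION_EXECUTE_HANDLER
                  : EXCEPTION_CONTINUE_SEARCH) {
        fprintf(stderr, "I/O error while paging in offset %lu\n",
                (unsigned long)offset);
        return 0;              /* report as an ordinary read failure */
    }
}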
I don't know how the error handling is performed in unix/linux though.
P.S. I'm not saying don't use memory-mapped files. I'm saying: do it carefully.
One thing to be aware of is that memory mapping requires a big contiguous chunk of (virtual) address space when the mapping is created. On a 32-bit system this particularly hurts, because on a loaded system getting a long contiguous run of address space is unlikely and the mapping will fail. On a 64-bit system this is much easier, as the upper bound of a 64-bit address space is... huge.
If you are running code in controlled environments (e.g. 64-bit server environments you are building yourself and know to run this code just fine) go ahead and map the entire file and just deal with it.
If you are trying to write general purpose code that will be in software that could run on any number of types of configurations, you'll want to stick to a smaller chunked mapping strategy. For example, mapping large files to collections of 1GB chunks and having an abstraction layer that takes operations like read(offset) and converts them to the offset in the right chunk before performing the op.
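A rough sketch of what such an abstraction layer might look like (POSIX calls; the chunk size, struct, and function names are all illustrative, and reads straddling a chunk boundary are omitted for brevity):

#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

#define CHUNK_SIZE ((size_t)1 << 30)   /* 1GB per mapping */
#define MAX_CHUNKS 64                  /* covers up to a 64GB file here */

struct chunked_file {
    int    fd;                  /* an open, read-only file descriptor */
    size_t file_size;
    void  *chunks[MAX_CHUNKS];  /* mappings, created lazily */
};

static void *get_chunk(struct chunked_file *cf, size_t idx)
{
    if (cf->chunks[idx] == NULL) {
        size_t off = idx * CHUNK_SIZE;
        size_t len = cf->file_size - off;
        if (len > CHUNK_SIZE)
            len = CHUNK_SIZE;
        void *m = mmap(NULL, len, PROT_READ, MAP_PRIVATE, cf->fd, (off_t)off);
        cf->chunks[idx] = (m == MAP_FAILED) ? NULL : m;
    }
    return cf->chunks[idx];
}

/* The read(offset) operation from above, reduced to a single byte. */
static int read_byte(struct chunked_file *cf, size_t offset, char *out)
{
    char *base = get_chunk(cf, offset / CHUNK_SIZE);
    if (base == NULL)
        return -1;                        /* mapping failed */
    *out = base[offset % CHUNK_SIZE];
    return 0;
}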
Hope that helps.
The purpose of the VirtualLock WinAPI call is to lock pages into the working set of a process. However, the WorkingSet64 API inexplicably doesn't count those pages.
Possibly as a result of this, neither Process Explorer nor the standard Task Manager count locked pages in their per-process memory usage statistics.
What's up with this? Could someone intimately familiar with virtual memory in WinNT shed some light on this inconsistency, which can cause gigabytes of used RAM to go essentially undetected? (think of SQL Server or VirtualBox)
Ah, that is easily explained: you're using the wrong API. GetProcessWorkingSetSize queries the minimum and maximum working set sizes. Those are quotas, not actual values.
The minimum working set size is what Windows guarantees to keep locked in RAM as long as the world does not end. The maximum working set size is the amount of memory Windows allows your process before pages are moved into the pool (they are not necessarily gone, but accessing them causes a fault and a re-mapping).
You want GetProcessMemoryInfo.
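A quick sketch of the difference between the two calls (link against psapi.lib; the output formatting is illustrative):

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = GetCurrentProcess();

    /* Quotas: what Windows guarantees/allows, not what is actually used. */
    SIZE_T wsMin = 0, wsMax = 0;
    if (GetProcessWorkingSetSize(h, &wsMin, &wsMax))
        printf("quota: min=%lu max=%lu\n",
               (unsigned long)wsMin, (unsigned long)wsMax);

    /* Actual values: the real current and peak working set. */
    PROCESS_MEMORY_COUNTERS pmc;
    if (GetProcessMemoryInfo(h, &pmc, sizeof(pmc)))
        printf("actual: working set=%lu peak=%lu\n",
               (unsigned long)pmc.WorkingSetSize,
               (unsigned long)pmc.PeakWorkingSetSize);
    return 0;
}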
EDIT:
Since it is now clear that you were not using the wrong API (you only named the wrong function), I've done some testing (VirtualAlloc and memory-mapped files, both in combination with VirtualLock) on my XP system. At first sight, it looked like you were totally right. Allocating 512MB, or memory-mapping 512MB out of a 650MB file, added 512MB to the virtual size but did not increase the working set. Following up with VirtualLock(512MB) did not affect the working set at all!
Then it occurred to me that VirtualLock took exactly zero time in every case, which did not seem plausible if it had to, say, fetch half a gigabyte from disk. So I checked the return code, and guess what: Windows doesn't think that locking 512MB is a good idea, and refuses to do it.
Repeated the experiment with only 64MB, and behold, the working set immediately went up by 64MB, just as it should. So, in one word: "works for me".
Just to be sure, you did check the return code?
On a second look, this behaviour is even well-defined and well-documented. The docs for VirtualLock state explicitly:
"The maximum number of pages that a process can lock is equal to the number of pages in its minimum working set minus a small overhead."
With and without locking, after appropriately setting the WS quotas, the working-set figures differ accordingly (screenshots omitted).
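For what it's worth, the experiment boils down to something like this sketch (the sizes are illustrative; note that both return codes are checked):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    const SIZE_T lockSize = 64 << 20;   /* 64MB: small enough to lock */

    /* The lockable amount is bounded by the minimum working set size,
       so raise the quotas first. */
    if (!SetProcessWorkingSetSize(GetCurrentProcess(),
                                  128 << 20, 256 << 20))
        printf("SetProcessWorkingSetSize failed: %lu\n", GetLastError());

    char *p = VirtualAlloc(NULL, lockSize, MEM_RESERVE | MEM_COMMIT,
                           PAGE_READWRITE);
    if (p == NULL) {
        printf("VirtualAlloc failed: %lu\n", GetLastError());
        return 1;
    }

    /* This is the return code that silently failed for 512MB above. */
    if (!VirtualLock(p, lockSize))
        printf("VirtualLock failed: %lu\n", GetLastError());
    else
        printf("locked %lu MB\n", (unsigned long)(lockSize >> 20));
    return 0;
}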
VirtualBox is a different matter: what you see in the task manager is only the working set of the "Interface" program and the "Manager" frontend, both of which maintain working sets below 64MB at all times. I'm not sure what memory it may allocate in its drivers, though, or whether it locks memory at all.
I'm currently running two virtual machines with 1.6GB of main memory each. Seeing how my 32-bit Windows only sees 3.25GB, that would leave a mere 50MB if the memory belonging to the VMs were locked. Besides, Process Explorer tells me that Firefox alone has a working set of 474MB, going up while I'm typing this (holy...?!!). That makes it unlikely that all the memory in the virtual machines is really locked, because such figures would then be entirely impossible.
As requested, here's a shot of VMMap (screenshot omitted).
The figures are admittedly funny: the VM has 1.6GB in total, of which, according to VMMap, 821MiB are reserved and 772MiB are committed, while Process Explorer shows only 163MiB and 54MiB, respectively. Something is definitely fishy there, but I suspect this is some obscure VirtualBox hackery rather than a Windows issue.
I'm developing a simple little toy OS in C and assembly as an experiment, but I'm starting to worry about my lack of knowledge of system memory.
I've been able to compile the kernel, run it in Bochs (loaded by GRUB), and have it print "Hello, world!" Now I'm off trying to make a simple memory manager so I can start experimenting with other things.
I found some resources on memory management, but they didn't really have enough code to go on (I understood the concepts, but I was at a loss as to how to actually implement them).
I tried a few more or less complicated strategies, then settled on a ridiculously simplistic one (just keep an offset into memory and increase it by the size of each allocated object) until the need to change arises. There is no fragmentation control, protection, or anything else yet.
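In code, that strategy is roughly the following sketch (the start address, the alignment, and the name kmalloc are all illustrative):

#include <stddef.h>
#include <stdint.h>

static uintptr_t next_free = 0x100000;   /* illustrative start: the 1MiB mark */

void *kmalloc(size_t size)
{
    uintptr_t p = (next_free + 7) & ~(uintptr_t)7;   /* 8-byte alignment */
    next_free = p + size;                            /* bump the offset  */
    return (void *)p;                                /* never freed      */
}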
So I would like to know where I can find more information when I do need a more robust manager. And I'd also like to learn more about paging, segmentation, and other relevant things. So far I haven't dealt with paging at all, but I've seen it mentioned often in OS development sites, so I'm guessing I'll have to deal with it sooner or later.
I've also read about some form of indirect pointers, where an application holds a pointer that is redirected by the memory manager to its real location. That's quite a ways off for me, I'm sure, but it seems important if I ever want to try virtual memory or defragmentation.
And also, where am I supposed to put my memory offset? I had no idea what the best spot was, so I just randomly picked 0x1000, and I'm sure it's going to come back to bite me later when I overwrite my kernel or something.
I'd also like to know what I should expect performance-wise (e.g. a big-O value for allocation and release) and what a reasonable ratio of memory management structures to actual managed memory would be.
Of course, feel free to answer just a subset of these questions. Any feedback is greatly appreciated!
If you don't know about it already, http://wiki.osdev.org/ is a good resource in general, and has multiple articles on memory management. If you're looking for a particular memory allocation algorithm, I'd suggest reading up on the "buddy system" method (http://en.wikipedia.org/wiki/Buddy_memory_allocation). I think you can probably find an example implementation on the Internet. If you can find a copy in a library, it's also probably worth reading the section of The Art Of Computer Programming dedicated to memory management (Volume 1, Section 2.5).
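As a taste of why the buddy system is attractive: the buddy of any block can be found with a single XOR, which is what makes splitting and coalescing cheap. A tiny sketch (the function name is made up):

#include <stdint.h>
#include <stdio.h>

/* For a block of size 2^k at byte offset "off" from the pool start
   (off must be a multiple of 2^k), the buddy lives at off XOR 2^k. */
static uintptr_t buddy_of(uintptr_t off, unsigned k)
{
    return off ^ ((uintptr_t)1 << k);
}

int main(void)
{
    /* The buddy of the 4KiB block at offset 0x3000 is at 0x2000. */
    printf("0x%lx\n", (unsigned long)buddy_of(0x3000, 12));
    return 0;
}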
I don't know where you should put the memory offset (to be honest, I've never written a kernel), but one thing that occurred to me that might work is to place a statically allocated variable at the end of the kernel and start allocations after that address. Something like:
(In the memory manager)
extern char endOfKernel;   /* defined in the file linked last, see below */
... (also in the memory manager)
char *myOffset = &endOfKernel;   /* start handing out memory from here */
... (at the end of the file that gets placed last in the binary)
char endOfKernel;
I guess it goes without saying, but depending on how serious you get about the operating system, you'll probably want some books on operating system design, and if you're in school it wouldn't hurt to take an OS class.
If you're using GCC with LD, you can create a linker script that defines a symbol at the end of the .bss section (which gives you the complete size of the kernel's memory footprint). Many kernels in fact use this value as a parameter for GRUB's AOUT_KLUDGE header.
See http://wiki.osdev.org/Bare_bones#linker.ld for more details, note the declaration of the ebss symbol in the linker script.
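Sketching the idea (the variable name below is illustrative): the linker script defines a symbol, e.g. ebss = .;, right after the .bss output section, and the C side then takes that symbol's address as the first free byte.

/* ebss has no storage of its own; the linker script defines it
   right after .bss, so &ebss is the end of the kernel image. */
extern char ebss;

static char *placement_offset;   /* illustrative bump-allocator pointer */

void mm_init(void)
{
    placement_offset = &ebss;    /* start allocating after the kernel */
}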