Here is a link to a fiddle with what I think is a memory leak I can't get around. I've spent almost all of today trying to track it down, explored a lot of three.js, and googled everything, but I can't find an answer to what is going on. I would have guessed that geometry.dispose() or material.dispose() should help here, but they don't seem to.
Do I somehow still hold a reference to all of the meshes/geometries/materials somewhere, even though I can't see them?
If you are wondering why I am making so many blocks instead of reusing old ones, it is out of convenience, and creating new blocks doesn't slow down my code noticeably, so I don't see why I shouldn't be able to. I just don't understand why the memory never seems to be released; eventually Chrome freezes or crashes.
If I never discover a solution, I plan to just reuse old blocks; I don't think that will be so bad. We will see.
I ended up solving my memory leak by reusing the geometry and material. I don't know why creating new geometries/materials leaks memory, but reusing them really made a huge difference.
I've been playing around with a simple raytracer in Go that has been working pretty neatly so far. I'm using multiple goroutines to render different parts of the image, which then place their results into a shared film.
Against my expectations, my Go code is still about 3 times slower than equivalent Java code. Was that to be expected? Furthermore, when inspecting the CPU usage in htop, I discovered that every core is only used to about 85%. Is that an issue with htop, or is there a problem with my code? Here is the CPU profile of my application.
I did set GOMAXPROCS via runtime.GOMAXPROCS(runtime.NumCPU()). The full code is on GitHub.
I would guess that the garbage collector is the problem; maybe you are making a lot of unnecessary allocations. Using runtime.ReadMemStats you can find out how much time the garbage collector has been running.
If that is the case, then you need to find a way to reduce memory allocations, for example by using pools of objects; look at sync.Pool. There are also a few useful links you can find via Google that explain how to reduce memory allocation. Look at this one, for example.
I've got an iMac whose VRAM appears to have gone on the fritz. On boot, things are mostly fine for a while, but as more and more windows are opened (i.e. textures are created on the GPU), I eventually hit the glitchy VRAM and get these bizarre "noisy" grid-like patterns of red and green in the windows.
I had an idea, but I'm mostly a newb when it comes to OpenGL and GPU programming in general, so I figured I'd ask here to see if it was plausible:
What if I wrote a little app that ran on boot and allocated GPU textures (of some reasonable quantum -- I dunno, maybe 256K?) until it consumed all available VRAM (i.e. until it can't allocate any more textures)? It would then upload a specific pattern of data into each texture, read each texture back from the GPU, and checksum the data against the original pattern. If a texture checks out, it gets released (for the rest of the system to use). If it doesn't checksum, the app hangs onto it (forever).
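In rough (legacy) OpenGL terms, I picture something like the sketch below. The texture size, the safety cap, and the assumption that an error from glTexImage2D means VRAM is genuinely exhausted are all guesses on my part; it also assumes a current GL context already exists, and I realize the driver may keep a system-memory copy of each texture that could defeat the read-back check.

    #include <OpenGL/gl.h>   // macOS; <GL/gl.h> on other platforms
    #include <cstdio>
    #include <cstring>
    #include <vector>

    // Hypothetical test routine -- assumes a current OpenGL context already exists.
    void SweepVram()
    {
        const int dim = 256;                               // 256x256 RGBA ~= 256 KB per texture
        std::vector<unsigned char> pattern(dim * dim * 4);
        for (size_t i = 0; i < pattern.size(); ++i)
            pattern[i] = static_cast<unsigned char>(i * 31 + 7);   // arbitrary test pattern

        while (glGetError() != GL_NO_ERROR) {}             // drain any stale GL errors

        // Phase 1: allocate and fill textures until the driver refuses.
        // (Drivers may page textures to system RAM, so an error here is only a
        // crude signal that "VRAM-ish" resources are exhausted.)
        std::vector<GLuint> textures;
        for (size_t n = 0; n < 16384; ++n) {               // safety cap in case the driver never fails
            GLuint tex = 0;
            glGenTextures(1, &tex);
            glBindTexture(GL_TEXTURE_2D, tex);
            glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, dim, dim, 0,
                         GL_RGBA, GL_UNSIGNED_BYTE, pattern.data());
            if (glGetError() != GL_NO_ERROR) {             // out of memory (or worse)
                glDeleteTextures(1, &tex);
                break;
            }
            textures.push_back(tex);
        }
        glFinish();                                        // make sure the uploads completed

        // Phase 2: read each texture back and compare it against the pattern.
        // Keep (deliberately leak) any texture that comes back corrupted so the
        // rest of the system never gets handed that piece of VRAM.
        std::vector<unsigned char> readback(pattern.size());
        size_t kept = 0;
        for (GLuint tex : textures) {
            glBindTexture(GL_TEXTURE_2D, tex);
            std::memset(readback.data(), 0, readback.size());
            glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE, readback.data());
            if (std::memcmp(pattern.data(), readback.data(), pattern.size()) != 0)
                ++kept;                                    // suspect: hold on to it forever
            else
                glDeleteTextures(1, &tex);                 // healthy: give it back
        }
        std::printf("allocated %zu textures, quarantined %zu\n", textures.size(), kept);
    }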
Flaws I can see: a user space app is not going to be able to definitively run through ALL the VRAM, since the system will have grabbed some, but really, I'm just trying to squeeze some extra life out of a dying machine here, so anything that helps in that regard is welcome. I'm also aware that reading back from VRAM is comparatively slow, but I'm not overly concerned with performance -- this is a practical endeavor, to be sure.
Does this sound plausible, or is there some fundamental truth about GPUs that I'm missing here?
Your approach is interesting, although I think there are other ways that might be easier to implement if you're looking for a quick fix or workaround. If your VRAM is on the fritz, then it's likely the corruption is taking place at a specific location. If you can consistently determine that it happens at a certain point (when VRAM is consuming x amount of memory, etc.), then you can work with it.
It's quite easy to create a RAM disk, and another possibility would be to allocate regular memory for VRAM. I know both of these are very possible, because I've done it. If someone says something "won't work" (no offense, Pavel), it shouldn't discourage you from at least trying. If you're interested in the techniques I mentioned, I'd be happy to provide more info; however, this is about your idea, and I'd like to know if you can make it work.
If you are able to write an app that runs on boot even before an OS loads, that would live in the bootloader, so why wouldn't you just do a self-test of the memory at that point?
Or did you mean a userland app that runs after the OS boots to the login screen? A userland app will not be able to cycle through every address as you described, simply because not every page is mapped directly into userland.
If you are sure that the RAM is the problem, did you try replacing the RAM?
I am new to Python and am writing something with pygame that is very bitmap intensive. Here are some (current) facts about it:
All graphics files have the potential to be reused at any point in a program instance.
It can take up 1GB+ memory if I pre-load everything in the beginning, even when there are no duplicates.
It is not hard to load the images when they are (almost) needed, i.e. the file sizes are very small compared to the memory they use, and it is easy to predict what will come next.
There are many suggestions not to use del, and I do not know whether that applies to my case. I have thought about leaning on the garbage collection mechanism by implementing a resource manager that holds the only reference to any loaded image and juggles different images, roughly by removing the reference to one while re-loading another.
However, I am not sure this really frees any memory at any point, and I don't know how to make the GC actually keep memory usage down consistently, as gc calls seem to be quite expensive (and, by default, too infrequent).
So, in summary, I would like to know whether the method outlined above is worth a try; if not, I hope someone can teach me other approaches, such as properly using del, and whether those fit pygame. Any help will be appreciated.
Try this and see if it's good enough: http://www.pygame.org/wiki/LazyImageLoading?parent=CookBook
When you first reference an item in an ImageController instance, it is loaded and returned. While a reference is kept to the image, it remains available in the ImageController. When the image no longer has any active references, it is removed from memory, and will be reloaded the next time it is referenced.
Keep your initial texture manager design as simple as possible. Afterwards, if profiling says you need more performance, then optimize.
How much of a memory leak is negligible?
In my program I am using Unity, and when I go through Profile > Leaks and work with the project, it shows a total of about 16KB of leaked memory caused by Unity, and I cannot help that.
EDIT: After playing with the program for a long time, it amounts to a 400KB leak.
What should I do? Is this amount of leakage acceptable for an iPad project?
It's not great, but it won't get your app rejected unless it causes a crash in front of a reviewer. The size is less important than how often it occurs. If it only occurs once every time the app is run, that's not a big deal. If it happens every time the user does something, then that's more of a problem.
It's probably a good idea for you to track down these bugs and fix them, because Objective-C memory management is quite different from Java's, and it's good to get some practice in with smaller issues before you're stuck trying to debug a huge problem with a deadline looming.
First, see whether you can use Unity in another way to circumvent the leak (if you have enough insight into the workings of the framework).
Second, report the leak to the Unity developers, if that has not already been done (by you or someone else).
Third, if you absolutely rely on this framework, hope it gets fixed ASAP, unless switching to another framework is an option for you.
A 400KB leak is not a very big deal unless it reaches that size within a few minutes. Still, no matter how small the leak, it is always worth keeping an eye on any leaks caused by your code or third-party code and trying to get rid of them in the next minor or major iteration of your app.
My application uses many critical sections, and I want to know which of them might cause high contention. I want to avoid bottlenecks, to ensure scalability, especially on multi-core, multi-processor systems.
I already found one accidentally when I noticed many threads hanging while waiting to enter a critical section while the application was under heavy load. That was rather easy to fix, but how do I detect such high-contention critical sections before they become a real problem?
I know there is a way to create a full dump and get that info from it (somehow?), but that is a rather intrusive approach. Are there methods an application can use on the fly to diagnose itself for such issues?
I could use the data from the _RTL_CRITICAL_SECTION_DEBUG structure, but there are notes that this could be unsafe across different Windows versions: http://blogs.msdn.com/b/oldnewthing/archive/2005/07/01/434648.aspx
Can someone suggest a reliable and not too complex method to get such info?
What you're talking about makes perfect sense during testing, but isn't really feasible in production code.
I mean... you CAN do things in production code, such as reading the LockCount and RecursionCount values (these are documented); subtract RecursionCount from LockCount and, presto, you have the number of threads waiting to get their hands on the CRITICAL_SECTION object.
You may even want to go deeper. The RTL_CRITICAL_SECTION_DEBUG structure IS documented in the SDK. The only thing that ever changed regarding this structure is that some reserved fields were given names and put to use. I mean... it's in the SDK headers (winnt.h); documented fields do NOT change. You misunderstood Raymond's story. (He's partially at fault; he likes a sensation as much as the next guy.)
My general point is: if there's heavy lock contention in your application, you should, by all means, ferret it out. But don't ever make the code inside a critical section bigger if you can avoid it. And reading the debug structure (or even LockCount/RecursionCount) should only ever happen while you're holding the object. It's fine in a debug/testing build, but it should not go into production.
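To make that concrete, here is a minimal debug-only sketch. The field names come straight from winnt.h, but the LockCount arithmetic follows the rule of thumb above, and its exact meaning has shifted between Windows versions (newer kernels bit-pack LockCount), so treat the number as a rough hint gathered while you own the lock, never as something to ship.

    #include <windows.h>
    #include <cstdio>

    // Debug-only helper: rough count of threads queued on a critical section
    // that the calling thread currently owns. The LockCount - RecursionCount
    // arithmetic matches the older, documented field layout; on newer Windows
    // versions LockCount is bit-encoded and the result may be off, so use it
    // only as a hint during testing.
    LONG WaitersEstimate(const CRITICAL_SECTION& cs)
    {
        return cs.LockCount - cs.RecursionCount;
    }

    CRITICAL_SECTION g_cs;

    void InitLocks() { InitializeCriticalSection(&g_cs); }   // call once at startup

    void Worker()
    {
        EnterCriticalSection(&g_cs);            // only peek while we own the CS
        LONG waiting = WaitersEstimate(g_cs);
        if (waiting > 0)
            std::printf("~%ld thread(s) waiting on g_cs\n", waiting);
        // ... protected work ...
        LeaveCriticalSection(&g_cs);
    }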
There are other ways to handle concurrency besides critical sections (e.g. semaphores). One of the best is non-blocking synchronization, which means structuring your code so that it does not require blocking even with shared resources. You should read up on concurrency. Also, you can post a code snippet here, and someone can give you advice on ways to improve your concurrency code.
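As a minimal illustration of the non-blocking idea (generic C++, not tied to your code; the variable names are made up for the example), shared statistics can be maintained with atomics so that no thread ever waits on a lock:

    #include <atomic>
    #include <cstdio>
    #include <thread>
    #include <vector>

    // Shared statistics updated without any critical section: every writer uses
    // atomic read-modify-write operations, so no thread ever blocks another.
    std::atomic<long long> g_bytesProcessed{0};
    std::atomic<long long> g_maxChunk{0};

    void RecordChunk(long long size)
    {
        g_bytesProcessed.fetch_add(size, std::memory_order_relaxed);

        // Lock-free "update the maximum": retry until either our value is
        // published or another thread has already published a larger one.
        long long seen = g_maxChunk.load(std::memory_order_relaxed);
        while (size > seen &&
               !g_maxChunk.compare_exchange_weak(seen, size, std::memory_order_relaxed))
        {
            // 'seen' was refreshed by the failed compare-exchange; the loop re-checks it.
        }
    }

    int main()
    {
        std::vector<std::thread> workers;
        for (int t = 0; t < 4; ++t)
            workers.emplace_back([t] {
                for (long long i = 1; i <= 100000; ++i)
                    RecordChunk(((t + 1) * i) % 4096);
            });
        for (auto& w : workers)
            w.join();
        std::printf("total=%lld max=%lld\n",
                    g_bytesProcessed.load(), g_maxChunk.load());
        return 0;
    }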
Take a look at Intel Thread Profiler. It should be able to help you spot such problems.
Also, you may want to instrument your code by wrapping critical sections in a proxy that dumps data to disk for analysis. What to record really depends on the app itself, but at the very least it could log how long a thread waited for the CS.
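Something along these lines is what I mean; it's only a rough sketch, and the 1 ms threshold, the lock name, and the printf logging are placeholders (in a real app you would buffer the records and write them to disk off the hot path):

    #include <windows.h>
    #include <cstdio>

    // Thin wrapper around CRITICAL_SECTION that measures how long each Enter()
    // had to wait. The threshold, name tag, and logging sink are placeholders.
    class TimedCriticalSection
    {
    public:
        explicit TimedCriticalSection(const char* name) : m_name(name)
        {
            InitializeCriticalSection(&m_cs);
            QueryPerformanceFrequency(&m_freq);
        }
        ~TimedCriticalSection() { DeleteCriticalSection(&m_cs); }

        void Enter()
        {
            LARGE_INTEGER before, after;
            QueryPerformanceCounter(&before);
            EnterCriticalSection(&m_cs);        // the potentially blocking call
            QueryPerformanceCounter(&after);

            double waitedMs =
                (after.QuadPart - before.QuadPart) * 1000.0 / m_freq.QuadPart;
            if (waitedMs > 1.0)                 // arbitrary "worth reporting" threshold
                std::printf("[%s] waited %.2f ms\n", m_name, waitedMs);
        }

        void Leave() { LeaveCriticalSection(&m_cs); }

    private:
        CRITICAL_SECTION m_cs;
        LARGE_INTEGER    m_freq;
        const char*      m_name;
    };

    // Usage: swap a raw CRITICAL_SECTION for the wrapper and call Enter()/Leave()
    // exactly where EnterCriticalSection/LeaveCriticalSection were called before.
    TimedCriticalSection g_cacheLock("cacheLock");   // hypothetical lock name

    void TouchCache()
    {
        g_cacheLock.Enter();
        // ... work done under the lock ...
        g_cacheLock.Leave();
    }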