I have a JavaScript web app, and as I click around in it, the memory used by Chrome seems to creep up gradually over time.
I'm trying to trace what might be being retained, and I found a large number of objects of a certain type (already one of my prime suspects for the leak).
The "heap snapshot" feature of Chromium looked like it might tell me what's actually retaining these objects, but it's being a bit unhelpful.
It looks like it's all hinging on one object that's being retained (and the others are all linked by parent/child lookups), but the thing that actually seems to be retaining it is not accessible:
I cleared the body (to eliminate retention by DOM elements) and removed the only global variable that referred to a Context, but I can't figure out why they're still hanging around.
Any idea what's going on here, and how to fix it?
I've written an app in LuaJIT, using a third-party GUI framework (FFI-based) plus some additional custom FFI calls. The app suddenly loses part of its functionality at some point soon after being run, and I'm quite confident it's because some unpinned objects are being GC-ed. I assume they're only referenced from the C world [1], so the Lua GC thinks they're unreferenced and can free them. The problem is that I don't know which of the numerous userdata are unreferenced (unpinned) on the Lua side.
To confirm my theory, I've run the app with GC disabled, via:
collectgarbage 'stop'
and lo, with this line, the app works perfectly well long past the point where it got broken before. Obviously, it's an ugly workaround, and I'd much prefer to have the GC enabled, and the app still working correctly...
I want to find out which unpinned object (userdata, I assume) gets GCed, so I can pin it properly on Lua side, to prevent it being GCed prematurely. Thus, my question is:
(How) can I track which userdata objects got collected when my app loses functionality?
One problem is that, AFAIK, the LuaJIT FFI already assigns custom __gc handlers, so I cannot add my own, as there can be only one per object. And anyway, the framework is too big for me to try adding __gc in each and every imaginable place in it. Also, I've already eliminated the "most obviously suspected" places in the code by removing local from some variables, thus making them part of _G, so I assume they're not GC-able. (Or is that not enough?)
[1] Specifically, WinAPI.
For now, I've added some ffi.gc() handlers to some of my objects (printing some easily visible ALL-CAPS messages), then added some eager collectgarbage() calls to try triggering the issue as soon as possible:
ffi.gc(foo, function()
print '\n\nGC FOO !!!\n\n'
end)
[...]
collectgarbage()
And indeed, this exposed some GC-ing I didn't expect. Specifically, it led me to discover a note in LuaJIT's FFI docs, which is most certainly relevant in my case:
Please note that [C] pointers [...] are not followed by the garbage collector. So e.g. if you assign a cdata array to a pointer, you must keep the cdata object holding the array alive [in Lua] as long as the pointer is still in use.
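Given that note, one workaround is an explicit anchor table: keep a strong Lua reference for every cdata object whose raw pointer crosses into C, and clear the entry only once the C side is done with it. This is only a sketch; `lib`, `set_buffer`, and the anchor-table layout are hypothetical stand-ins for your framework's calls:

```lua
-- Sketch: pin cdata whose pointers are held only on the C side.
local ffi = require 'ffi'
local lib = ffi.load 'mylib'   -- hypothetical library

local anchors = {}             -- strong references; remove entries when C is done

local function set_buffer_pinned(obj, n)
  local buf = ffi.new('uint8_t[?]', n)
  anchors[obj] = buf           -- pin: Lua now keeps the array alive
  lib.set_buffer(obj, buf)     -- C side only stores the raw pointer
end
```

As long as the entry stays in `anchors`, the GC sees the cdata as reachable, so the pointer the C side stored remains valid.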
I am doing memory profiling for my application using the Visual Studio diagnostic tools. I find that Node objects take up a lot of memory (based on Inclusive Size Diff. (Bytes); see #1 below). And when I click on the first instance of Node and look at 'Referenced Objects', I see that Node references other Nodes, and I see something like 'OverlappedData' among the attributes.
How can I find out what is creating these Node objects, as they come from mscorlib.ni.dll?
The weapon of choice when you are rooting through these .NET Framework objects is a good decompiler. I use Reflector, there are others.
You see an opaque Node<T> object back. Just type it into the Search box; out pop but a few types that use it. Most are in the System.Collections.Concurrent namespace. Well, look no further, the profiler already told you about that one. Clearly it is the ConcurrentStack<T> class in the System.Collections.Concurrent namespace that's storing Nodes.
Your profiler told you there is just one ConcurrentStack<> object that owns these Nodes. Well, good, that narrows it down to just a single object. It just happens to have 208 elements. Hmm, well, not that much, is it?
That's not where you have to stop. The ConcurrentStack<> class is one that almost nobody ever uses directly in their own code. Keep using the decompiler and let it search for usages of that class.
Ah, nice, that's a very short list as well. You see System.Data.ProviderBase show up several times; hmm, this question probably isn't related to querying databases. The only other set of references is System.PinnableBufferCache.
"Pinnable buffers", whoa, that's a match. Pinning a buffer is important when you ask native code to get a job done filling a managed array. With BeginRead(), the universal asynchronous I/O call. The driver needs a stable reference to the array while it is working on the async I/O request, and getting a stable buffer requires pinning in .NET. And big bingo in the profiler data: you see OverlappedData, the core data structure in Windows for async I/O.
Long story short, you found this guy's project. Few programmers ever notice it.
Knowing when to stop profiling is very important. You cannot change code written by another programmer. And nobody at Microsoft thinks that guy did anything wrong. He didn't; caches are Good Things.
You are most definitely done. Congratulations.
I have a page that uses many, many WebGL contexts, one per canvas. Canvases can be reloaded, resized, etc., each time creating new contexts. It works for several reloads, but eventually, when I try to create a new context, it returns null. I assume I'm running out of memory.
I would like to be able to delete the contexts that I'm no longer using so I can recover the memory and use it for my new contexts. Is there any way to do this? Or is there a better way to handle many canvases?
Thanks.
This is a long-standing bug in Chrome and WebKit:
http://code.google.com/p/chromium/issues/detail?id=124238
There is no way to "delete" a context in WebGL. Contexts are deleted by garbage collection whenever the system gets around to it. All resources will be freed at that point, but it's best if you delete your own resources rather than waiting for the browser to delete them.
As I said this is a bug. It will eventually be fixed but I have no ETA.
I would suggest not deleting canvases. Keep them around and re-use them.
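A minimal sketch of that re-use idea (the `CanvasPool` API is hypothetical, not a standard one): keep released canvases in a free list so each canvas keeps its single context for its whole lifetime. The canvas factory is injectable, so the logic can be exercised with a stub outside a browser:

```javascript
// Canvas pool sketch: acquire/release canvases instead of creating and
// discarding them, so no new WebGL contexts pile up waiting for GC.
class CanvasPool {
  constructor(createCanvas) {
    this.createCanvas = createCanvas;
    this.free = [];
  }
  acquire(width, height) {
    const canvas = this.free.pop() || this.createCanvas();
    canvas.width = width;   // note: resizing clears the canvas contents
    canvas.height = height;
    return canvas;
  }
  release(canvas) {
    this.free.push(canvas); // keep it around for re-use instead of deleting
  }
}

// Usage with a stub factory; in a browser you would pass
// () => document.createElement('canvas') and call canvas.getContext('webgl').
let created = 0;
const pool = new CanvasPool(() => ({ id: ++created, width: 0, height: 0 }));
const a = pool.acquire(300, 150);
pool.release(a);
const b = pool.acquire(640, 480);
console.log(created); // 1 -- the canvas (and its context) was re-used
```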
Another suggestion would be tell us why you need 200 canvases. Maybe the problem you are trying to solve would be better solved a different way.
I would assume that until you release all the resources attached to your context, something will still hold references to it, and therefore it will still exist.
A few things to try:
Here is some debug GL code. There is a function there to reset a context to its initial state. Try that before deleting the canvas it belongs to.
It's possible that some event system holds a reference to your contexts, keeping them in a zombie state.
Are you deleting your canvases from the DOM? I'm sure there is a limit on the resources a page can maintain in one instance.
SO was complaining that our comment thread was getting a bit long. Try these things first and let me know if it helps.
A simple search for DoEvents brings up lots of results that lead, basically, to:
DoEvents is evil. Don't use it. Use threading instead.
The reasons generally cited are:
Re-entrancy issues
Poor performance
Usability issues (e.g. drag/drop over a disabled window)
But some notable Win32 functions such as TrackPopupMenu and DoDragDrop perform their own message processing to keep the UI responsive, just like DoEvents does.
And yet, none of these seem to suffer from those issues (performance, re-entrancy, etc.).
How do they do it? How do they avoid the problems cited with DoEvents? (Or do they?)
DoEvents() is dangerous. But I bet you do lots of dangerous things every day. Just yesterday I set off a few explosive devices (future readers: note the original post date relative to a certain American holiday). With care, we can sometimes account for the dangers. Of course, that means knowing and understanding what the dangers are:
Re-entry issues. There are actually two dangers here:
Part of the problem here has to do with the call stack. If you call .DoEvents() in a loop that itself handles messages that use DoEvents(), and so on, you're getting a pretty deep call stack. It's easy to over-use DoEvents() and accidentally fill up your call stack, resulting in a StackOverflowException. If you're only using .DoEvents() in one or two places, you're probably okay. If it's the first tool you reach for whenever you have a long-running process, you can easily find yourself in trouble here. Even one use in the wrong place can make it possible for a user to force a StackOverflowException (sometimes just by holding down the Enter key), and that can be a security issue.
It is sometimes possible to find your same method on the call stack twice. If you didn't build the method with this in mind (hint: you probably didn't) then bad things can happen. If everything passed in to the method is a value type, and there is no dependence on things outside of the method, you might be fine. But otherwise, you need to think carefully about what happens if your entire method were to run again before control is returned to you at the point where .DoEvents() is called. What parameters or resources outside of your method might be modified that you did not expect? Does your method change any objects, where both instances on the stack might be acting on the same object?
Performance Issues. DoEvents() can give the illusion of multi-threading, but it's not real multithreading. This has at least three real dangers:
When you call DoEvents(), you are giving control on your existing thread back to the message pump. The message pump might in turn give control to something else, and that something else might take a while. The result is that your original operation could take much longer to finish than if it were in a thread by itself that never yields control, and definitely longer than it needs to.
Duplication of work. Since it's possible to find yourself running the same method twice, and we already know this method is expensive/long-running (or you wouldn't need DoEvents() in the first place), even if you accounted for all the external dependencies mentioned above so there are no adverse side effects, you may still end up duplicating a lot of work.
The other issue is the extreme version of the first: a potential to deadlock. If something else in your program depends on your process finishing, and will block until it does, and that thing is called by the message pump from DoEvents(), your app will get stuck and become unresponsive. This may sound far-fetched, but in practice it's surprisingly easy to do accidentally, and the crashes are very hard to find and debug later. This is at the root of some of the hung app situations you may have experienced on your own computer.
Usability Issues. These are side-effects that result from not properly accounting for the other dangers. There's nothing new here, as long as you looked in other places appropriately.
If you can be sure you accounted for all these things, then go ahead. But really, if DoEvents() is the first place you look to solve UI responsiveness/updating issues, you're probably not accounting for all of those issues correctly. If it's not the first place you look, there are enough other options that I would question how you made it to considering DoEvents() at all. Today, DoEvents() exists mainly for compatibility with older code that came into being before other credible options were available, and as a crutch for newer programmers who haven't yet gained enough experience for exposure to the other options.
The reality is that most of the time, at least in the .Net world, a BackgroundWorker component is nearly as easy, at least once you've done it once or twice, and it will do the job in a safe way. More recently, the async/await pattern or the use of a Task can be much more effective and safe, without needing to delve into full-blown multi-threaded code on your own.
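As a sketch of that async/await alternative (assuming .NET 4.5+ WinForms; `ProcessImage` and the control names are hypothetical):

```csharp
// Instead of a loop that calls DoEvents(), run the long work off the UI
// thread and await it; the message pump keeps running without re-entering
// your own code. `ProcessImage` is a hypothetical long-running method.
private async void ProcessButton_Click(object sender, EventArgs e)
{
    ProcessButton.Enabled = false;            // guard against re-entry
    try
    {
        await Task.Run(() => ProcessImage()); // work runs on a pool thread
        statusLabel.Text = "Done";            // back on the UI thread here
    }
    finally
    {
        ProcessButton.Enabled = true;
    }
}
```

Disabling the button while the task runs addresses the re-entrancy danger directly, rather than hoping nobody clicks twice.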
Back in 16-bit Windows days, when every task shared a single thread, the only way to keep a program responsive within a tight loop was DoEvents. It is this non-modal usage that is discouraged in favor of threads. Here's a typical example:
' Process image
For y = 1 To height
    For x = 1 To width
        ProcessPixel x, y
    Next
    DoEvents ' <-- DON'T DO THIS -- just put the whole loop in another thread
Next
For modal things (like tracking a popup), it is likely to still be OK.
I may be wrong, but it seems to me that DoDragDrop and TrackPopupMenu are rather special cases, in that they take over the UI, so don't have the reentrancy problem (which I think is the main reason people describe DoEvents as "Evil").
Personally, I don't think it's helpful to dismiss a feature as "Evil"; rather, explain the pitfalls so that people can decide for themselves. In the case of DoEvents there are rare cases where it's still reasonable to use it, for example while a modal progress dialog is displayed, where the user can't interact with the rest of the UI, so there is no re-entrancy issue.
Of course, if by "Evil" you mean "something you shouldn't use without fully understanding the pitfalls", then I agree that DoEvents is evil.
Relevant background info
I've built a little software that can be customized via a config file. The config file is parsed and translated into a nested environment structure (e.g. .HIVE$db = an environment, .HIVE$db$user = "Horst", .HIVE$db$pw = "my password", .HIVE$regex$date = some regex for dates etc.)
I've built routines that can handle those nested environments (e.g. look up value "db/user" or "regex/date", change it, etc.). The thing is that the initial parsing of the config files takes a long time and results in quite a big object (actually three to four of them, between 4 and 16 MB). So I thought, "No problem, let's just cache them by saving the object(s) to .Rdata files." This works, but loading the cached objects makes my Rterm process's RAM consumption go through the roof (over 1 GB!), and I still don't really understand why (this doesn't happen when I compute the objects anew, but that's exactly what I'm trying to avoid since it takes too long).
I already thought about maybe serializing it, but I haven't tested it as I would need to refactor my code a bit. Plus I'm not sure if it would affect the "loading back into R" part in just the same way as loading .Rdata files.
Question
Can anyone tell me why loading a previously computed object has such effects on memory consumption of my Rterm process (compared to computing it in every new process I start) and how best to avoid this?
If desired, I will also try to come up with an example, but it's a bit tricky to reproduce my exact scenario. Yet I'll try.
It's likely because the environments you are creating are carrying around their ancestors. If you don't need the ancestor information, then set the parents of such environments to emptyenv() (or just don't use environments if you don't need them).
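A quick sketch of the emptyenv() fix (the `node` object and field names are illustrative, following the .HIVE example from the question):

```r
# Create config environments that do not drag their ancestors (and
# everything reachable from them) into the saved .Rdata file.
node <- new.env(parent = emptyenv())
node$user <- "Horst"
save(node, file = "config.Rdata")  # only node's own bindings get serialized
```

With the default parent (the calling environment), save() can end up serializing the whole chain of enclosing environments, which would explain the ballooning on load.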
Also note that formulas (and, of course, functions) have environments so watch out for those too.
If it's not reproducible by others, it will be hard to answer. However, I do something quite similar to what you're doing, yet I use JSON files to store all of my values. Rather than parse the text, I use RJSONIO to convert everything to a list, and getting stuff from a list is very easy. (You could, if you want, convert to a hash, but it's nice to have layers of nested parameters.)
See this answer for an example of how I've done this kind of thing. If that works out for you, then you can forego the expensive translation step and the memory ballooning.
(Taking a stab at the original question...) I wonder if your issue is that you are using an environment rather than a list. Saving environments might be tricky in some contexts. Saving lists is no problem. Try using a list or try converting to/from an environment. You can use the as.list() and as.environment() functions for this.
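A sketch of that list round-trip (variable names are illustrative): convert before saving and convert back after loading, so no environment, with its parent chain, is ever serialized.

```r
# Convert the environment to a plain list before caching it.
cfg_env <- new.env()
cfg_env$user <- "Horst"
cfg_list <- as.list(cfg_env)          # a plain list is safe to save()
save(cfg_list, file = "config.Rdata")

# After load(), rebuild the environment if your lookup routines need one.
restored <- as.environment(cfg_list)
```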