Find memory leak in very complex Ruby app - ruby

Hi everyone!
It's nice to work with Ruby and write some code. But over the past week I noticed that we have a problem in our application: memory usage is growing like an O(x*3) function.
Our application is very complex; it is based on EventMachine and other external libs. What's more, it runs on the amd64 (64-bit) version of FreeBSD using Ruby 1.8.7-p382.
I've tried to research on my own how to find a memory leak in our app.
I've found many tools and libs, but they don't work under 64-bit FreeBSD, and I have no idea how to proceed to find leaks in a huge Ruby application. It's OK if you have a few files with 200-300 lines of code, but here we have around 30 files averaging 200-300 lines of code each.
I just realized that I'd need far too much time to find those leaks by brute force: guessing that some part of the code may be leaking and wrapping it in tracking code, for example using the ruby-prof gem. That is a painfully slow approach because, as I said, we have too much code.
So, my question is: how do I find a memory leak in a very complex Ruby app without putting my whole life into this work?
Thanks in advance.

One thing to try, even though it can massively degrade performance, is to manually trigger the garbage collector by calling GC.start every so often. How often is kind of subjective, as the more you run it the slower the app, and the less you run it the higher the memory footprint.
For whatever reason, the garbage collector may go on vacation from time to time, presumably not wanting to interfere if there is some heavy processing going on. As such you may have to manually call to have your trash taken away.
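A minimal sketch of the "call GC.start every so often" idea, assuming you have some hot loop or timer callback where a counter can be bumped. The PeriodicGC class and the interval of 1000 are hypothetical names and values chosen for illustration, not part of any library:

```ruby
# Hypothetical helper: forces a full GC pass once every `interval` calls to #tick.
class PeriodicGC
  attr_reader :runs

  def initialize(interval)
    raise ArgumentError, "interval must be positive" unless interval > 0
    @interval = interval
    @calls = 0
    @runs = 0
  end

  # Call this from a hot loop or a timer callback.
  # Returns true when a collection was forced on this call.
  def tick
    @calls += 1
    return false unless (@calls % @interval).zero?
    GC.start
    @runs += 1
    true
  end
end

gc = PeriodicGC.new(1000)
5000.times do
  "temporary garbage" * 10   # stand-in for real work that allocates
  gc.tick
end
puts "forced GC runs: #{gc.runs}"
```

In an EventMachine app the same effect could probably be had by calling GC.start from a periodic timer instead of counting iterations. Either way the interval is a trade-off to tune: smaller means a slower app, larger means a bigger footprint.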
One way to avoid creating trash is to use memory more efficiently. Don't create hashes when arrays will do the job, don't create arrays when a single string will suffice, and so on. It will be important to profile your application to see what kind of objects are cluttering up your heap before you just start hacking away randomly.
If you can, try and use 1.9.2 which has made significant gains in terms of memory management. Ruby Enterprise Edition is also an option if you need 1.8.7 compatibility, as it's essentially a better garbage collector for that version.

How hard would it be to run your app on a Linux box? If you don't have the same memory problems there, it is probably something specific to your Ruby runtime. If you do have the same problems, you can use all the tools and libs that are Linux-only.
Another alternative: can you wrap your unit tests with some memory tracking code? Most unit test frameworks make it easy to add some code before/after each test. Or you could just run each test 1000000000 times and see if the memory goes out of control. If it does, you know something that happens in that test is causing the leak, and you can continue to isolate the problem.
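A rough sketch of what such tracking code could look like, using only ObjectSpace and GC from core Ruby. The helper name live_object_growth and the LEAKY_CACHE constant are made up for this example; in a real suite the block would be the body of one test:

```ruby
# Hypothetical helper: runs the given block `times` times and returns how many
# more live objects exist afterwards than before (after forcing a GC each side).
def live_object_growth(times)
  GC.start
  before = ObjectSpace.each_object(Object) {}   # returns the number of objects visited
  times.times { yield }
  GC.start
  after = ObjectSpace.each_object(Object) {}
  after - before
end

LEAKY_CACHE = []   # simulated leak: grows on every call and is never cleared

# This block keeps a reference to every string it builds.
leaky = live_object_growth(10_000) { LEAKY_CACHE << "payload" + "x" }

# This block builds the same strings but lets them go.
clean = live_object_growth(10_000) { "payload" + "x" }

puts "leaky block growth: #{leaky}"
puts "clean block growth: #{clean}"
```

The exact numbers will vary from run to run because Ruby's collector is conservative, but a block that leaks should show growth proportional to the repeat count, while a clean one should stay near zero.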

Have you tried counting the number of objects you have, using ObjectSpace.each_object? Although you intend to work in small batches, maybe you simply have more objects than you think.
count = ObjectSpace.each_object() {}
# => 7216
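A total count only tells you that something is growing. Breaking the count down by class and comparing two snapshots can hint at what is growing. The following is a sketch under that assumption; object_counts_by_class, top_growth, Session and RETAINED are hypothetical names used only for this illustration:

```ruby
# Tally live objects by class using ObjectSpace.each_object.
def object_counts_by_class
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Return the classes whose instance count grew the most between two snapshots.
def top_growth(before, after, limit = 5)
  growth = after.map { |klass, n| [klass, n - before[klass]] }
  growth.select { |_, delta| delta > 0 }.sort_by { |_, delta| -delta }.first(limit)
end

class Session; end   # hypothetical class standing in for an application object
RETAINED = []        # simulated leak: holds every Session forever

GC.start
before = object_counts_by_class
500.times { RETAINED << Session.new }
GC.start
after = object_counts_by_class

top_growth(before, after).each do |klass, delta|
  puts "#{klass}: +#{delta}"
end
```

Taking a snapshot like this every few minutes in the running app and logging the top growers is one way to narrow 30 files down to the handful that create the suspicious class.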

Related

Go not using cpu fully

I've been playing around with a simple raytracer in Go that has been working pretty neatly so far. I'm using multiple goroutines to render different parts of the image, which then place their results into a shared film.
Against my expectations, my Go code is still about 3 times slower than the equivalent Java code. Is that to be expected? Further, when inspecting the CPU usage in htop, I discovered that every core is only used to about 85%. Is that an issue with htop or is there a problem with my code? Here is the cpu profile of my application.
I did set GOMAXPROCS as runtime.GOMAXPROCS(runtime.NumCPU()). The full code is on github.
I would guess that the garbage collector is the problem. Maybe you are making a lot of unnecessary allocations. By using runtime.ReadMemStats you can find out how much time the garbage collector has been running.
If this is the case, then you must find a way to reduce memory allocations, for example by using pools of objects. Look at sync.Pool. There are also a few useful links that you can find via Google that explain how to reduce memory allocation. Look at this one for example.

Amount of acceptable Memory Leak

How much memory leak is negligible?
In my program, I am using Unity, and when I go through Profile > Leaks and work with the project it shows a total of about 16KB of leaked memory caused by Unity, which I cannot do anything about.
EDIT: After a long play with the program it amounts to a 400KB leak.
What should I do? Is this amount of memory leak acceptable for an iPad project?
It's not great, but it won't get your app rejected unless it causes a crash in front of a reviewer. The size is less important than how often it occurs. If it only occurs once every time the app is run, that's not a big deal. If it happens every time the user does something, then that's more of a problem.
It's probably a good idea for you to track down these bugs and fix them, because Objective C memory management is quite different compared to Java, and it's good to get some practice in with smaller stuff before you're stuck trying to debug a huge problem with a deadline looming.
First, look if you can use Unity in another way to circumvent the leak (if you have enough insight into the workings of this framework).
Then, report the leakage to the Unity developers if not already done (by you or someone else).
Third, if you absolutely rely on this framework, hope it gets fixed ASAP, unless switching to another framework is an option for you.
A 400K leak is not a very big deal unless it reaches that size within a few minutes. Still, no matter how small the leak, it is always necessary to keep an eye on any leak caused by your own or third-party code and try to get rid of it in the next minor or major iteration of your app.

Why do you use the keyword delete?

I understand that delete returns memory that was allocated from the heap back to the heap, but what is the point? Computers have plenty of memory, don't they? And all of the memory is returned as soon as you "X" out of the program.
Example:
Consider a server that allocates an object Packet for each packet it receives (this is bad design for the sake of the example).
A server, by nature, is intended to never shut down. If you never delete the thousands of Packet your server handles per second, your system is going to swamp and crash in a few minutes.
Another example:
Consider a video game that allocates particles for a special effect every time a new explosion is created (and never deletes them). In a game like Starcraft (or other recent ones), after a few minutes of hilarity and destruction (and hundreds of thousands of particles), the lag will be so huge that your game will turn into a PowerPoint slideshow, effectively making your player unhappy.
Not all programs exit quickly.
Some applications may run for hours, days or longer. Daemons may be designed to run without cease. Programs can easily consume more memory over their lifetime than available on the machine.
In addition, not all programs run in isolation. Most need to share resources with other applications.
There are a lot of reasons why you should manage your memory usage, as well as any other computer resources you use:
What might start off as a lightweight program could soon become more complex. Depending on your design, areas of memory consumption may grow exponentially.
Remember you are sharing memory resources with other programs. Being a good neighbour allows other processes to use the memory you free up, and helps to keep the entire system stable.
You don't know how long your program might run for. Some people hibernate their session (or never shut their computer down) and might keep your program running for years.
There are many other reasons, I suggest researching on memory allocation for more details on the do's and don'ts.
I see your point that computers have lots of memory, but you are wrong. As an engineer you have to create programs that use computer resources properly.
Imagine you made a program that runs the whole time the computer is on. It sometimes creates objects or variables with "new". After some time you don't need them anymore, but you don't delete them. This happens from time to time, and each time you take some RAM out of circulation. After a while the user has to terminate your program and launch it again. That is not so bad, but it is not comfortable either; what is more, your program may take a while to load. The user ends up paying for your careless decision.
Another thing: when you use "new" to create an object you call its constructor, and "delete" calls its destructor. Let's say the object opens some file, and the destructor closes it and makes it accessible to other processes. In that case you would steal not only memory but also files from other processes.
If you don't want to use "delete" you can use shared pointers (they use reference counting to destroy the object automatically once nothing points to it).
One is available in the standard library as std::shared_ptr. It has one disadvantage: Windows XP SP2 and older do not support it. So if you want to create something for the public you should use Boost, which also has boost::shared_ptr. To use Boost you need to download it from here and configure your development environment to use it.

How to detect high contention critical sections?

My application uses many critical sections, and I want to know which of them might cause high contention. I want to avoid bottlenecks, to ensure scalability, especially on multi-core, multi-processor systems.
I already found one accidentally when I noticed many threads hanging while waiting to enter a critical section when the application was under heavy load. That was rather easy to fix, but how can I detect such high contention critical sections before they become a real problem?
I know there is a way to create a full dump and get that info from it (somehow?). But this is a rather intrusive way. Are there methods the application can use on the fly to diagnose itself for such issues?
I could use data from structure _RTL_CRITICAL_SECTION_DEBUG, but there are notes that this could be unsafe across different Windows versions: http://blogs.msdn.com/b/oldnewthing/archive/2005/07/01/434648.aspx
Can someone suggest a reliable and not too complex method to get such info?
What you're talking about makes perfect sense during testing, but isn't really feasible in production code.
I mean.. you CAN do things in production code, such as determine the LockCount and RecursionCount values (this is documented), subtract RecursionCount from LockCount and presto, you have the # of threads waiting to get their hands on the CRITICAL_SECTION object.
You may even want to go deeper. The RTL_CRITICAL_SECTION_DEBUG structure IS documented in the SDK. The only thing that ever changed regarding this structure was that some reserved fields were given names and were put to use. I mean.. it's in the SDK headers (winnt.h), documented fields do NOT change. You misunderstood Raymond's story. (He's partially at fault, he likes a sensation as much as the next guy.)
My general point is, if there's heavy lock contention in your application, you should, by all means, ferret it out. But don't ever make the code inside a critical section bigger if you can avoid it. And reading the debug structure (or even lockcount/recursioncount) should only ever happen when you're holding the object. It's fine in a debug/testing version, but it should not go into production.
There are other ways to handle concurrency besides critical sections (e.g. semaphores). One of the best ways is non-blocking synchronization. That means structuring your code to not require blocking even with shared resources. You should read up on concurrency. Also, you can post a code snippet here and someone can give you advice on ways to improve your concurrency code.
Take a look at Intel Thread Profiler. It should be able to help to spot such problems.
Also, you may want to instrument your code by wrapping critical sections in a proxy that dumps data to disk for analysis. It really depends on the app itself, but it could at least record how long each thread has been waiting for the CS.

Compact Framework and JIT. How long could it take

We have/had a phantom delay in our app. This was traced to the initialisation of a singleton when the object was touched for the first time and was blamed on JIT. I'm not utterly convinced by this as there is no mechanism for measuring JIT (or is there?) and the entire delay was seven seconds. Seven seconds of JIT?!? Could that be for real?
Either way, I have difficulty blaming things that one cannot easily measure. When I had a glance at the issue a while back, I commented out a bunch of code and watched the seven second delay "jump" elsewhere in the app, suggesting it is somehow happening on a background process somewhere (and I guess this would count JIT in as a potential cause).
Just for fun if there was a static object that happened to reference a lot of other objects does anyone have a rule of thumb for how long the JIT might take? Does anyone have further references so I can understand more about the JIT so I stand a chance of learning whether or not JIT is/was to blame for this slow down?
I've only seen JIT take a really long time (greater than 1 second) in a weird bug that had to do with templated items inside a templated collection (see edit below).
At any rate, the fact you see it "move" definitely indicates to me that it probably isn't the issue. To try to determine this definitively I'd look at using RPM to see what's happening right before and after the delay.
Expected JIT time is a really nebulous thing, since there are so many factors that can affect it. Processor speed is an obvious one, but less obvious might be things like app storage media and device memory pressure.
Storage media can affect JIT speed because the JITter has to pull the IL from the media when it needs to JIT it, and if pulling it is slow, then JITting it will be slow.
Memory pressure is a tough one, and can have serious repercussions on a CE device. The issue here is that when you start running out of memory, the EE will start pitching JITted code during collection - everything but the call stack. Now if you're in a routine that, for example, calls out to some worker or helper stuff, or has a thread running, then that helper method could be getting pitched, JITted, pitched JITted, etc. This is referred to as "thrash."
Identifying the latter is fairly easy with RPM (fixing it may not be so easy). Look for the amount of code pitched rising frequently, and for a strong correlation between a rise in the number of pitches and your perceived lock ups.
Edit: I finally found the bug description here.
JIT (and GC) timers etc. can be found here:
Performance Counters in the .NET Compact Framework (http://msdn.microsoft.com/en-us/library/ms172525.aspx)
Monitoring Application Performance on the .NET Compact Framework Part I - Enabling performance counters (http://blogs.msdn.com/davidklinems/archive/2005/10/04/476988.aspx)
Analyzing Device Application Performance with the .Net Compact Framework Remote Performance Monitor (http://blogs.msdn.com/stevenpr/archive/2006/04/17/577636.aspx)
Performance Counters in the .NET Framework (http://msdn.microsoft.com/en-us/library/w8f5kw2e(VS.80).aspx)
Regards,
tamberg

Resources