How to profile garbage collection in Ruby - ruby

I'm trying to profile GC in a non-Rails application, preferably using YARV Ruby.
perftools.rb is telling me that the majority of my CPU time is spent in garbage_collector (6061 (61.4%)).
I'm also able to get how many objects are created by which methods with perftools.rb . Some methods create more objects than others, but it's not extremely skewed.
Where do I go from here? Is it possible to get more detailed information on why it's spending so much time doing GC? Is it possible to see whether the time is spent getting rid of objects, or whether it is spent checking whether an object should be garbage collected or not?
I have access to OS X Lion, Windows 7 and Ubuntu 12.04.

On OS X you have DTrace, and there are DTrace providers for YARV Ruby.
You have a couple of probes related to GC that you can use:
gc-begin
gc-end
gc-mark-begin
gc-mark-end
gc-sweep-begin
gc-sweep-end
I think they can help you find out what the GC in your program is doing. Have a look at this file to see how to use them: https://github.com/tenderlove/ruby/blob/probes/test/dtrace/test_gc.rb.
And this post for more explanations: http://tenderlovemaking.com/2011/06/29/i-want-dtrace-probes-in-ruby.html
There's an open bug in Ruby, http://bugs.ruby-lang.org/issues/2565, where you can find a patch that adds those probes. Alternatively, you can use https://github.com/tenderlove/ruby/tree/probes, where the patch is already applied.
Hope this helps
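If DTrace isn't available (for example on Windows or Ubuntu), a rough alternative is the GC::Profiler module that ships with Ruby 1.9 and later. This is a different technique from the probes above, and it only reports per-run GC time rather than a mark/sweep split. A minimal sketch, where allocate_garbage is a hypothetical stand-in for your real workload:

```ruby
# Sketch: measure GC activity with the built-in GC::Profiler (not DTrace).
GC::Profiler.enable

# Hypothetical workload: allocate many short-lived strings.
def allocate_garbage(n)
  n.times.map { |i| "object #{i}" }
  nil
end

runs_before = GC.count
allocate_garbage(200_000)
GC.start # force at least one collection so the report is not empty

runs = GC.count - runs_before
gc_seconds = GC::Profiler.total_time

puts "GC runs during workload: #{runs}"
puts format("Total GC time: %.2f ms", gc_seconds * 1000)
GC::Profiler.report # per-run table printed to stdout
GC::Profiler.disable
```

Comparing the total GC time against the wall-clock time of the workload gives a second opinion on the figure perftools.rb reports.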

Related

How to provide an online ruby REPL?

On a site like www.codewars.com, one can run ruby in a sort of sandbox, almost identical to IRB.
How does this actually work?
If the submitted code is eval()d, what's preventing me from submitting a system("rm -rf *") or redefining basic functions so that 50% of the time Array.sort actually runs Array.shuffle?
The simplest and safest solution is to run the Ruby code on a separate computer, which you wipe and re-install after every run. This is, however, also a pretty heavyweight solution.
More lightweight, but (almost) as safe, would be using a virtual machine or a container instead of a whole separate computer, and e.g. using a read-only filesystem with a ramfs overlay, which you umount after every run. (Or just throw away and recreate the container.)
You could also use JRuby together with the JVM's security features (or IronRuby with the CLI's). The JVM has sandboxing features for JVM programs, and after all, JRuby is just a Java program like any other.
Lastly, you could write your own Ruby implementation with sandboxing in mind, or modify an existing one. The three options above are fairly simple; this one is hard, because most Ruby implementations aren't designed for sandboxing. TryRuby.Com worked this way, for example, and it took a significant amount of time to update it for Ruby 1.9, because it was originally based on a modified version of MRI, but MRI doesn't support Ruby 1.9. So the implementation had to be switched to YARV, and a lot of the modifications that made it sandbox-safe had to be re-implemented from scratch. (The JRuby/IronRuby option above is similar to this, but you push off the work of making the implementation sandbox-safe to someone else, e.g. Oracle or Microsoft.)
A not-so-safe but also simple solution would be to run the interpreter under a restricted user account.
Of course, you can combine multiple approaches for defense-in-depth, for example, running a sandboxed interpreter under a restricted user account on a separate VM.
What does not work is to statically analyze the code before running it. The pesky Halting Problem bites us here.
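To illustrate why scanning the source is not enough, here is a small sketch. The looks_safe? filter is a hypothetical, deliberately naive check invented for this example, and the snippet is only inspected as a string, never evaluated:

```ruby
# Hypothetical naive filter: reject any source text that mentions "system".
def looks_safe?(source)
  !source.include?("system")
end

obvious = 'system("rm -rf *")'

# Same effect, but the method name is only assembled at run time,
# so the literal word "system" never appears in the source text.
sneaky = 'Kernel.send(("sys" + "tem").to_sym, "rm -rf *")'

puts looks_safe?(obvious) # the filter catches this one
puts looks_safe?(sneaky)  # the filter lets this one through
```

Any textual blacklist can be sidestepped in a similar way, which is why the approaches above isolate the running process instead of trying to vet the code.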

Since adding observers to my Ruby module, my system locks up

It only happens on certain types of errors, for example if I make a call to a method that doesn't exist on one of my objects. But it's hard to get any information on what is causing this, because I can't step through it: my debugger locks up as well. When I look at top, I see something like 97% of my CPU time being taken up by a Ruby process. I tried running Sample Process in Activity Monitor to see if it could show me where it is getting stuck, but nothing relevant seems to come up (just a lot of OS X classes).
This is a Padrino project, I am running Ruby 1.9.2 and using the Observable mixin. I am on OS X Lion. Any ideas or suggestions for troubleshooting? This is killing my productivity!!
Which version of Padrino do you have? The latest release, 0.10.1, fixes this problem.

Slow loading of rails environment

Similar issue to Slow loading rails environment
Loading the rails environment takes quite a bit of time and I'm not sure exactly why.
time ruby -r./config/environment.rb -e ""
real 0m18.590s
user 0m17.200s
sys 0m1.320s
Are there any tools/ways that can help me find why it is spending so much time to load the environment?
The project is fairly large, so I am assuming that it is coming from all the gem dependencies, but I would think that it would be able to be improved somehow.
If you are using Ruby 1.9, see this blog post; it may describe the issue you are experiencing. If so, it has to do with the number of requires in your project and the way require is implemented in 1.9. There is a patch available to improve this performance.
I tried patching my Ruby with the rhnh patch cited above as well as the rvm-patchsets (on independent Ruby installs, of course) but didn't pick up a lot of performance. Some people do, it seems, so maybe it's a Ruby version or lower-level issue.
My current workaround, at least in my dev environment, is to use rails-sh to preload the environment one time and then reuse it for my rails/rake commands. It's a big performance pickup. I wrote more details on it in this answer.
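To see where the load time goes, one option is to wrap require and record how long each call takes. This is only a sketch: REQUIRE_TIMES and require_without_timing are names made up for this example, the times are inclusive of nested requires, and require "json" stands in for loading ./config/environment.

```ruby
require "benchmark"

# Hypothetical global: accumulated seconds per required name.
REQUIRE_TIMES = Hash.new(0.0)

module Kernel
  # Keep the original require reachable under a new (made-up) name.
  alias_method :require_without_timing, :require

  def require(name)
    result = nil
    elapsed = Benchmark.realtime { result = require_without_timing(name) }
    REQUIRE_TIMES[name.to_s] += elapsed
    result
  end
  private :require
end

require "json" # stand-in for: require "./config/environment"

# Print the ten slowest requires (inclusive of whatever they load themselves).
REQUIRE_TIMES.sort_by { |_, seconds| -seconds }.first(10).each do |name, seconds|
  puts format("%8.1f ms  %s", seconds * 1000, name)
end
```

If a handful of gems dominate the list, those are the ones worth lazy-loading or dropping; if the time is spread evenly over hundreds of requires, the require implementation itself is the more likely culprit.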

Track down Memory leaks in a Ruby Script

I have created a Ruby XMPP Framework called babylon. I have then created a few applications with it and even though they run pretty smoothly, it seems that they're eating my computer memory bit by bit.
I suspected leaks, so first, I added this at some point in my code :
puts `ps -o rss= -p #{Process.pid}`.to_i
As suspected, the output kept increasing... slowly, but surely.
I tried to hunt the leaks with Dike, as explained here.
Unfortunately, Dike was not able to detect any leak. Even after it had run for quite a long time, it still returned the same objects.
So, how can I be sure that my framework is leaking, and not just taking some RAM until some maximum point and then starting to release it?
And then, how can I actually track the leaks and fix them?
Thanks for your help!
I've heard good things about the Ruby Memory Tracking API but it is not free.
There is also a useful blog post for using valgrind to find ruby memory leaks.
There are other solutions for Ruby on Rails but it doesn't seem like you are using rails at all.
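One free way to tell a real leak from memory that simply plateaus is to compare live object counts between two points in time using ObjectSpace.count_objects, which is built into Ruby 1.9 and later. A rough sketch follows; LEAK and the helper names are hypothetical, and the loop simulates a leak by keeping references in a global array:

```ruby
# Count live objects per internal type after forcing a collection.
def live_object_counts
  GC.start
  counts = ObjectSpace.count_objects
  counts.reject { |type, _| type == :TOTAL || type == :FREE }
end

# Return only the types whose live count grew between two snapshots.
def growth(before, after)
  after.each_with_object({}) do |(type, count), diff|
    delta = count - before.fetch(type, 0)
    diff[type] = delta if delta > 0
  end
end

LEAK = [] # hypothetical leak: a global array that is never cleared

before = live_object_counts
10_000.times { |i| LEAK << "message #{i}" }
after = live_object_counts

diff = growth(before, after)
diff.sort_by { |_, delta| -delta }.each do |type, delta|
  puts "#{type}: +#{delta}"
end
```

Taking such snapshots periodically in the running application shows whether a particular type keeps growing after every forced GC, which points to retained references rather than ordinary heap growth.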

Ruby Performance

I'm pretty keen to develop my first Ruby app, as my company has finally blessed its use internally.
In everything I've read about Ruby up to v1.8, there is never anything positive said about performance, but I've found nothing about version 1.9. The last figures I saw about 1.8 had it drastically slower than just about everything out there, so I'm hoping this was addressed in 1.9.
Has performance drastically improved? Are there some concrete things that can be done with Ruby apps (or things to avoid) to keep performance at the best possible level?
There are some benchmarks of 1.8 vs 1.9 at http://www.rubychan.de/share/yarv_speedups.html. Overall, it looks like 1.9 is a lot faster in most cases.
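If you would rather measure on your own machine, a tiny script run under each interpreter gives a first impression. This is only a sketch with a hypothetical micro-benchmark (a naive recursive fib), so it says little about real application performance:

```ruby
require "benchmark"

# Hypothetical CPU-bound workload: naive recursive Fibonacci.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# RUBY_ENGINE is not defined on very old interpreters, so fall back to "ruby".
engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : "ruby"

result = nil
elapsed = Benchmark.realtime { result = fib(20) }

puts format("%s %s: fib(20) = %d in %.1f ms", engine, RUBY_VERSION, result, elapsed * 1000)
```

Running the same file with each installed Ruby (1.8, 1.9, JRuby) lets you compare the printed timings directly.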
If scalability and performance are really important to you, you can also check out Ruby Enterprise Edition. It's a custom implementation of the Ruby interpreter that's supposed to be much better about memory allocation and garbage collection. I haven't seen any objective metrics comparing it directly to JRuby, but all of the anecdotal evidence I've heard has been very, very good.
This is from the same company that created Passenger (aka mod_rails) which you should definitely check out as a rails deployment solution if you decide not to go the JRuby route.
Matz's Ruby 1.8.6 is much slower when it comes to performance, and 1.9 and JRuby do a lot to speed it up. But the performance isn't such that it will prevent you from doing anything you want in a web application. There are many large Ruby on Rails sites that do just fine with the "slower interpreted" language. When you get to scaling out web apps, there are many more pressing performance issues than the speed of the language you are writing them in.
I've actually heard really good things about the performance of the JVM implementation, JRuby. Completely anecdotal, but perhaps worth looking into.
See also http://en.wikipedia.org/wiki/JRuby#Performance
Check out "Writing Efficient Ruby Code" from Addison Wesley Professional:
http://safari.oreilly.com/9780321540034
I found some very helpful and interesting insights in this short work. And if you sign up for the free 10-day trial you could read it for free. (It's 50 pages and the trial gets you (AFAIR) 100 page views.)
https://ssl.safaribooksonline.com/promo
I am not a Ruby programmer, but I have been pretty tightly involved in a JRuby deployment lately and can thus draw some conclusions. Do not expect too much from JRuby's performance. In interpreted mode, it seems to be somewhere in the range of C Ruby. JIT mode might be faster, but only in theory. In practice, we tried JIT mode on Glassfish for a decently-sized Rails application on a medium-sized server (dual core, 8GB RAM). And the truth is, the JITting took so much time that the server needed 20-30 minutes before it answered the first request. Memory usage was astronomical, and profiling did not work because the whole system ground to a halt with a profiler attached.
Bottom line: JRuby has its merits (multithreading, solid platform, easy Java integration), but given that interpreted mode is the only mode that worked for us in practice, it may be expected to be no better performance-wise than C Ruby.
I'd second the recommendation of Passenger: it makes deployment and management of Rails applications trivial.
