Comparing the "Size" of 2 LIbraries - ruby

If I wanted to compare the size of 2 libraries, say Sinatra vs. Rails, what would be the most accurate way to do that?
I was thinking of creating the following 2 docker images and comparing their sizes, but wasn't sure if I needed to compare the image, the container, or neither (for whatever reason):
A - {Base Stuff} + Ruby + Sinatra
B - {Base Stuff} + Ruby + Rails
Is that a good approach or would I want to do something different?

There are several ways in which you can technically compare library sizes but the real question is what do you get out of these comparisons.
Sinatra is "smaller" than Rails in pretty much every metric you can think of:
Lines of code
Memory usage by code
Memory usage of your application if you use the library
Number of dependencies
Lines of code in library + dependencies
Memory usage of library + dependencies
Sinatra also does a lot less than Rails (and I think it's accurate to say that everything that Sinatra does, Rails does too in some way).
But this comparison isn't easily generalized to arbitrary libraries because of the number of axes. Consider library A that is a thin wrapper around library C, and a library B:
Library A has fewer lines of code than library B
Library A + dependencies has more lines of code than library B
When you use either A or B they both utilize their dependencies, so which one is "smaller"?
Or, compare a pure Ruby library like net/http with a library like curb that is backed by a C extension or a C library. The pure Ruby library may have fewer lines of code but greater memory footprint and lower performance. The C library may have vastly more lines of code and higher performance and smaller memory footprint. But if you compare curb without curl (the backing C library) to net/http maybe curb has fewer lines of code.

Related

Does requiring from the standard library make a program slower?

I’m thinking of ruby specifically when I ask this question, but if the answer is language-agnostic, I’d like to know as well.
I often require from the ruby standard library, namely fileutils, open3, and pathname.
But if I only need to use their capabilities in one or two lines, I avoid them and go for (sometimes less readable) alternatives that don’t need a require.
However, using them doesn’t appear to hurt performance of my scripts, and even with quick benchmarks (using time), things seem to run at the same speed they would if I used different methods. But it seems weird to me it would make no difference, because why then have them be required for use (and not just included outright)? So, specific questions:
Does importing from the standard library make scripts slower?
If so, is it always negligible, or does it depend on the package?
What about third-party packages? Are they slower to import than the ones in the standard library?
Importing a library will take a non-zero amount of time, but the amount of time is directly proportional to what library. Some are very small, some much larger, but all of the ones that ship with Ruby are usually quick to load.
Unless you're running your script a thousand times a second the impact of a require is going to be minimal.
It's usually better to get all the require operations out of the way as early as possible to shake out any dependency problems, especially with gems. There's nothing worse than code that crashes because of a broken dependency, but only when you perform a particular action that doesn't happen often.
If you are starting this process thousands of times over, consider a tool like Spring or your own forking model to avoid the startup penalty. You can fork a pre-configured process any number of times, each of which will be ready nearly instantaneously.
When using require you are telling Ruby to load some file (your own file, standard library or external library).
So yes, it takes time to find and load it.
But the required file is loading only once (usually at start because usually require at the top of file).
Loads the given name, returning true if successful and false if the feature is already loaded.
The absolute path of the loaded file is added to $LOADED_FEATURES ($"). A file will not be loaded again if its path already appears in $"
And the time spent depends on the file size and the size of the associated files.

Ruby Efficiency

I was just listening to a lecture at coursera (https://www.coursera.org/saas/) and the professor was saying that everything in Ruby is an object and that every method call is calling a send method on an object, passing some params to it. This includes numbers, arrays and other basic classes.
I went on Google and looked for efficiency benchmarks and I found the following: http://benchmarksgame.alioth.debian.org/u32/which-programs-are-fastest.html
While it's not shocking that a compiled language is faster than an interpreted one, the performance difference between (Ruby, Python) and Java for instance is shocking.
Even if there's a way to compile ruby code (I have not researched this topic), I think the efficiency problem would still be there due to the core "problem" in the language:
Basic operations are being too heavy: 1+1 takes many more CPU cycles to complete.
I love Ruby. I love the high level aspect of meta programming and I think this is where the future should be heading, and I agree, sometimes we need to compromise certain thing in order to be more effective: I don't see myself optimizing my code in assembly in order to save a few extra milliseconds. However, when we do 1+1 in C, it's not exponentially increasing the amount of time a basic operation is taking!
My question is how are you guys dealing with operation intensive programs? We have a Ruby on Rails project we have been developing for about a year now and we're at a point we'll start doing some machine learning with geolocation traversal and prioritization.
I hope you understand my concerns and offer reasonable suggestions :-)
This is not such a bad question as it seems looking at the comments. The only problem is the wrongly used word exponential, but the described problem is real.
I have similar Ruby usage patterns as you describe - I'm doing a lot of natural language processing in Ruby, which involves machine learning as well.
I use the following techniques to overcome Ruby performance issues:
Use C libraries with Ruby interface whenever applicable. E.g. I would not use Ruby implementation of SVM or decision trees, since there are much faster implementations in C available.
Write my own Ruby wrappers for C implementation if they are not available. Usually this is not much problematic - I use RubyInline gem extensively to glue Ruby with C code.
Patch Ruby memory management or manually control its garbage collector.
Consider JRuby as your Ruby platform - you would get easy access to fast Java libraries for machine learning (Weka) and the like and general performance of your application would be preserved or even better.

Sparse Matrix Libraries for Ruby

I'm looking for a Sparse Matrix library I can use from Ruby. I'm currently using the GNU Scientific Library bindings provided by the "gsl" gem, but my application would be better optimized if I used a dedicated sparse matrix library. I've investigated the linalg and NArray libraries. None of the these three libraries support sparse-matrix optimised storage or operations.
Is there anything out there I've missed - or an existing C library that may be possible to write bindings for? I'd prefer the former to that latter, as I haven't written C bindings in Ruby before, but I would be willing to attempt it.
Like Bill mentioned above, the a pure ruby interpretation is going to be slower than you want, but might be good for prototyping. I have been working on just such a library over at https://github.com/hmcfletch/sparse-matrix
I haven't released it as a gem yet and there is more work to be done on it, but take a look at if you stil have a need.
Pure ruby solutions are going to be ridiculously slow. I'd be tempted to pick up something like MTJ (http://code.google.com/p/matrix-toolkits-java/) and use it under JRuby.
There's a bunch of java code out there; much of it is pretty mature, although I don't know the space well enough to recommend a particular library. I can tell you that I've used java from jruby often and it's a joy to work with.
Have you seen SciRuby?
We don't have a sparse matrix implemented currently, but we're working on it. We're also in the process of rewriting NArray, with Masahiro Tanaka's blessing.
One goal is to have everything working in pure Ruby, in C (via GSL bindings, typically), and in Java for JRuby. (Pure Ruby would then be the fallback if GSL, etc., were unavailable.)
Side note: This is a terrible answer to this question. I post it here mainly so that anyone else who happens to be working on such things knows where to find us. =)

Convert Ruby to low level languages?

I have all kind of scripting with Ruby:
rails (symfony)
ruby (php, bash)
rb-appscript (applescript)
Is it possible to replace low level languages with Ruby too?
I write in Ruby and it converts it to java, c++ or c.
Cause People say that when it comes to more performance critical tasks in Ruby, you could extend it with C. But the word extend means that you write C files that you just call in your Ruby code. I wonder, could I instead use Ruby and convert it to C source code which will be compiled to machine code. Then I could "extend" it with C but in Ruby code.
That is what this post is about. Write everything in Ruby but get the performance of C (or Java).
The second advantage is that you don't have to learn other languages.
Just like HipHop for PHP.
Are there implementations for this?
Such a compiler would be an enormous piece of work. Even if it works, it still has to
include the ruby runtime
include the standard library (which wasn't built for performance but for usability)
allow for metaprogramming
do dynamic dispatch
etc.
All of these inflict tremendous runtime penalties, because a C compiler can neither understand nor optimize such abstractions. Ruby and other dynamic languages are not only slower because they are interpreted (or compiled to bytecode which is then interpreted), but also because they are dynamic.
Example
In C++, a method call can be inlined in most cases, because the compiler knows the exact type of this. If a subtype is passed, the method still can't change unless it is virtual, in which case a still very efficient lookup table is used.
In Ruby, classes and methods can change in any way at any time, thus a (relatively expensive) lookup is required every time.
Languages like Ruby, Python or Perl have many features that simply are expensive, and most if not all relevant programs rely heavily on these features (of course, they are extremely useful!), so they cannot be removed or inlined.
Simply put: Dynamic languages are very hard to optimize, simply doing what an interpreter would do and compiling that to machine code doesn't cut it. It's possible to get incredible speed out of dynamic languages, as V8 proves, but you have to throw huge piles of money and offices full of clever programmers at it.
There is https://github.com/seattlerb/ruby_to_c Ruby To C compiler. It actually only takes in a subset of Ruby though. I believe the main missing part is the Meta Programming features
In a recent interview (November 16th, 2012) Yukihiro "Matz" Matsumoto (creator of Ruby) talked about compiling Ruby to C
(...) In University of Tokyo a research student is working on an academic research project that compiles Ruby code to C code before compiling the binary code. The process involves techniques such as type inference, and in optimal scenarios the speed could reach up to 90% of typical hand-written C code. So far there is only a paper published, no open source code yet, but I’m hoping next year everything will be revealed... (from interview)
Just one student is not much, but it might be an interesting project. Probably a long way to go to full support of Ruby.
"Low level" is highly subjective. Many people draw the line differently, so for the sake of this argument, I'm just going to assume you mean compiling Ruby down to an intermediate form which can then be turned into machine code for your particular platform. I.e., compiling ruby to C or LLVM IR, or something of that nature.
The short answer is yes this is possible.
The longer answer goes something like this:
Several languages (Objective-C most notably) exist as a thin layer over other languages. ObjC syntax is really just a loose wrapper around the objc_*() libobjc runtime calls, for all practical purposes.
Knowing this, then what does the compiler do? Well, it basically works as any C compiler would, but also takes the objc-specific stuff, and generates the appropriate C function calls to interact with the objc runtime.
A ruby compiler could be implemented in similar terms.
It should also be noted however, that just by converting one language to a lower level form does not mean that language is instantly going to perform better, though it does not mean it will perform worse either. You really have to ask yourself why you're wanting to do it, and if that is a good reason.
There is also JRuby, if you still consider Java a low level language. Actually, the language itself has little to do here: it is possible to compile to JVM bytecode, which is independent of the language.
Performance doesn't come solely from "low level" compiled languages. Cross-compiling your Ruby program to convoluted, automatically generated C code isn't going to help either. This will likely just confuse things, include long compile times, etc. And there are much better ways.
But you first say "low level languages" and then mention Java. Java is not a low-level language. It's just one step below Ruby in terms of high- or low-level languages. But if you look at how Java works, the JVM, bytecode and just-in-time compilation, you can see how high level languages can be fast(er). Ruby is currently doing something similar. MRI 1.8 was an interpreted language, and had some performance problems. 1.9 is much faster, it's using a bytecode interpreter. I'm not sure if it'll ever happen on MRI, but Ruby is just one step away from JIT on MRI.
I'm not sure about the technologies behind jRuby and IronRuby, but they may already be doing this. However, both have their own advantages and disadvantages. I tend to stick with MRI, it's fast enough and it works just fine.
It is probably feasible to design a compiler that converts Ruby source code to C++. Ruby programs can be compiled to Python using the unholy compiler, so they could be compiled from Python to C++ using the Nuitka compiler.
The unholy compiler was developed more than a decade ago, but it might still work with current versions of Python.
Ruby2Cextension is another compiler that translates a subset of Ruby to C++, though it hasn't been updated since 2008.

What makes Ruby slow? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
Ruby is slow at certain things. But what parts of it are the most problematic?
How much does the garbage collector affect performance? I know I've had times when running the garbage collector alone took several seconds, especially when working with OpenGL libraries.
I've used matrix math libraries with Ruby that were particularly slow. Is there an issue with how ruby implements basic math?
Are there any dynamic features in Ruby that simply cannot be implemented efficiently? If so, how do other languages like Lua and Python solve these problems?
Has there been recent work that has significantly improved performance?
Ruby is slow. But what parts of it are the most problematic?
It does "late lookup" for methods, to allow for flexibility. This slows it down quite a bit. It also has to remember variable names per context to allow for eval, so its frames and method calls are slower. Also it lacks a good JIT compiler currently, though MRI 1.9 has a bytecode compiler (which is better), and jruby compiles it down to java bytecode, which then (can) compile via the HotSpot JVM's JIT compiler, but it ends up being about the same speed as 1.9.
How much does the garbage collector effect performance? I know I've had times when running the garbage collector alone took several seconds, especially when working with OpenGL libraries.
from some of the graphs at http://www.igvita.com/2009/06/13/profiling-ruby-with-googles-perftools/ I'd say it takes about 10% which is quite a bit--you can decrease that hit by increasing the malloc_limit in gc.c and recompiling.
I've used matrix math libraries with Ruby that were particularly slow. Is there an issue with how ruby implements basic math?
Ruby 1.8 "didn't" implement basic math it implemented Numeric classes and you'd call things like Fixnum#+ Fixnum#/ once per call--which was slow. Ruby 1.9 cheats a bit by inlining some of the basic math ops.
Are there any dynamic features in Ruby that simply cannot be implemented efficiently? If so, how do other languages like Lua and Python solve these problems?
Things like eval are hard to implement efficiently, though much work can be done, I'm sure. The kicker for Ruby is that it has to accomodate for somebody in another thread changing the definition of a class spontaneously, so it has to be very conservative.
Has there been recent work that has significantly improved performance?
1.9 is like a 2x speedup. It's also more space efficient. JRuby is constantly trying to improve speed-wise [and probably spends less time in the GC than KRI]. Besides that I'm not aware of much except little hobby things I've been working on. Note also that 1.9's strings are at times slower because of encoding friendliness.
Ruby is very good for delivering solutions quickly. Less so for delivering quick solutions. It depends what kind of problem you're trying to solve. I'm reminded of the discussions on the old CompuServe MSBASIC forum in the early 90s: when asked which was faster for Windows development, VB or C, the usual answer was "VB, by about 6 months".
In its MRI 1.8 form, Ruby is - relatively - slow to perform some types of computationally-intensive tasks. Pretty much any interpreted language suffers in that way in comparison to most mainstream compiled languages.
The reasons are several: some fairly easily addressable (the primitive garbage collection in 1.8, for example), some less so.
1.9 addresses some of the issues, although it's probably going to be some time before it becomes generally available. Some of the other implementation that target pre-existing runtimes, JRuby, IronRuby, MagLev for example, have the potential to be significantly quicker.
Regarding mathematical performance, I wouldn't be surprised to see fairly slow throughput: it's part of the price you pay for arbitrary precision. Again, pick your problem. I've solved 70+ of the Project Euler problems in Ruby with almost no solution taking more than a mintue to run. How fast do you need it to run and how soon do you need it?
The most problematic part is "everyone".
Bonus points if that "everyone" didn't really use the language, ever.
Seriously, 1.9 is much faster and now is on par with python, and jruby is faster than jython.
Garbage collectors are everywhere; for example, Java has one, and it's faster than C++ on dynamic memory handling. Ruby isn't suited well for number crunching; but few languages are, so if you have computational-intensive parts in your program in any language, you better rewrite them in C (Java is fast with math due to its primitive types, but it paid dearly for them, they're clearly #1 in ugliest parts of the language).
As for dynamic features: they aren't fast, but code without them in static languages can be even slower; for example, java would use a XML config instead of Ruby using a DSL; and it would likely be SLOWER since XML parsing is costly.
Hmm - I worked on a project a few years ago where I scraped the barrel with Ruby performance, and I'm not sure much has changed since. Right now it's caveat emptor - you have to know not to do certain things, and frankly games / realtime applications would be one of them (since you mention OpenGL).
The culprit for killing interactive performance is the garbage collector - others here mention that Java and other environments have garbage collection too, but Ruby's has to stop the world to run. That is to say, it has to stop running your program, scan through every register and memory pointer from scratch, mark the memory that's still in use, and free the rest. The process can't be interrupted while this happens, and as you might have noticed, it can take hundreds of milliseconds.
Its frequency and length of execution is proportional to the number of objects you create and destroy, but unless you disable it altogether, you have no control. My experience was there were several unsatisfactory strategies to smooth out my Ruby animation loop:
GC.disable / GC.enable around critical animation loops and maybe an opportunistic GC.start to force it to go when it can't do any harm. (because my target platform at the time was a 64MB Windows NT machine, this caused the system to run out of memory occasionally. But fundamentally it's a bad idea - unless you can pre-calculate how much memory you might need before doing this, you're risking memory exhaustion)
Reduce the number of objects you create so the GC has less work to do (reduces the frequency / length of its execution)
Rewrite your animation loop in C (a cop-out, but the one I went with!)
These days I would probably also see if JRuby would work as an alternative runtime, as I believe it relies on Java's more sophisticated garbage collector.
The other major performance issue I've found is basic I/O when trying to write a TFTP server in Ruby a while back (yeah I pick all the best languages for my performance-critical projects this was was just an experiment). The absolute simplest tightest loop to simply respond to one UDP packet with another, contaning the next piece of a file, must have been about 20x slower than the stock C version. I suspect there might have been some improvements to make there based around using low-level IO (sysread etc.) but the slowness might just be in the fact there is no low-level byte data type - every little read is copied out into a String. This is just speculation though, I didn't take this project much further but it warned me off relying on snappy I/O.
The main speed recent increase that has gone on, though I'm not fully up-to-date here, is that the virtual machine implementation was redone for 1.9, resulting in faster code execution. However I don't think the GC has changed, and I'm pretty sure there's nothing new on the I/O front. But I'm not fully up-to-date on bleeding-edge Ruby so someone else might want to chip in here.
I assume that you're asking, "what particular techniques in Ruby tend to be slow."
One is object instantiation. If you are doing large amounts of it, you want to look at (reasonable) ways of reducing that, such as using the flyweight pattern, even if memory usage is not a problem. In one library where I reworked it not to be creating a lot of very similar objects over and over again, I doubled the overall speed of the library.
Steve Dekorte: "Writing a Mandelbrot set calculator in a high level language is like trying to run the Indy 500 in a bus."
http://www.dekorte.com/blog/blog.cgi?do=item&id=4047
I recommend to learn various tools in order to use the right tool for the job. Doing matrix transformations could be done efficiently using high-level API which wraps around tight loops with arithmetic-intensive computations. See RubyInline gem for an example of embedding C or C++ code into Ruby script.
There is also Io language which is much slower than Ruby, but it efficiently renders movies in Pixar and outperforms raw C on vector arithmetics by using SIMD acceleration.
http://iolanguage.com
https://renderman.pixar.com/products/tools/it.html
http://iolanguage.com/scm/git/checkout/Io/docs/IoGuide.html#Primitives-Vector
Ruby 1.9.1 is about twice as fast as PHP, and a little bit faster than Perl, according to some benchmarks.
(Update: My source is this (screenshot). I don't know what his source is, though.)
Ruby is not slow. The old 1.8 is, but the current Ruby isn't.
Ruby is slow because it was designed to optimize the programmers experience, not the program's execution time. Slowness is just a symptom of that design decision. If you would prefer performance to pleasure, you should probably use a different language. Ruby's not for everything.
IMO, dynamic languages are all slow in general. They do something in runtime that static languages do in compiling time.
Syntax Check, Interpreting and Like type checking, converting. this is inevitable, therefore ruby is slower than c/c++/java, correct me if I am wrong.

Resources