Making ruby program run on all processors - ruby

I've been looking at optimizing a Ruby program that's quite calculation-intensive on a lot of data. I don't know C and have chosen Ruby (not that I know it well either), and I'm quite happy with the results, apart from the time it takes to execute. It is a lot of data, and without spending any money, I'd like to know what I can do to make sure I'm maximizing my own system's resources.
When I run a basic Ruby program, does it use a single processor? If I have not specifically assigned tasks to a processor, Ruby won't read my program and magically load each processor to complete the program as fast as possible will it? I'm assuming no...
I've been reading a bit on speeding up Ruby, and in another thread read that Ruby does not support true multithreading (though it said JRuby does). But if I were to "break up" my program into two chunks that can be run in separate instances and run these in parallel... would these two chunks run on two separate processors automatically? If I had four processors and opened up four shells and ran four separate parts (1/4) of the program - would it complete in 1/4 the time?
Update
After reading the comments I decided to give JRuby a shot. Porting the app over wasn't that difficult. I haven't used "peach" yet, but just by running it in JRuby, the app runs in 1/4 the time!!! Insane. I didn't expect that much of a change. Going to give .peach a shot now and see how that improves things. Still can't believe that boost.
Update #2
Just gave peach a try. Ended up shaving another 15% off the time. So switching to JRuby and using Peach was definitely worth it.
Thanks everyone!

Use JRuby and the peach gem, and it couldn't be easier. Just replace an .each with .peach and voila, you're executing in parallel. And there are additional options to control exactly how many threads are spawned, etc. I have used this and it works great.
You get close to n times speedup, where n is the number of CPUs/cores available. I find that the optimal number of threads is slightly more than the number of CPUs/cores.
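Under JRuby, what .peach amounts to is splitting the collection across a pool of native threads. As a rough illustration of that idea (this is a stdlib-only sketch, not the peach gem's actual implementation), parallel_each below is a hypothetical helper:

```ruby
# Sketch of a parallel each: split the collection into slices and hand
# each slice to its own thread. Under JRuby these are native threads,
# so CPU-bound work really does run on multiple cores.
def parallel_each(items, thread_count = 4)
  slices = items.each_slice((items.size / thread_count.to_f).ceil).to_a
  threads = slices.map do |slice|
    Thread.new { slice.each { |item| yield item } }
  end
  threads.each(&:join)
end

results = Queue.new  # Queue is thread-safe, unlike a plain Array
parallel_each((1..8).to_a, 4) { |n| results << n * n }
```

With the gem itself, the call site is just `collection.peach { |item| ... }` instead.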

As others have said, the MRI implementation of Ruby (the one most people use) does not support native threads, so you cannot split work between CPU cores by launching more threads under MRI.
However if your process is IO-bound (restricted by disk or network activity for example), then you may still benefit from multiple MRI-threads.
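A quick sketch of why that is: MRI releases its global lock while a thread is blocked on IO, so blocking waits overlap even though computation doesn't (sleep here stands in for a network or disk wait):

```ruby
# Four threads each "wait on IO" for 0.5s. Even under MRI's global
# lock the waits overlap, so total elapsed time is roughly 0.5s,
# not 2s - threads help here despite the lack of native threading.
start = Time.now
threads = 4.times.map { Thread.new { sleep 0.5 } }
threads.each(&:join)
elapsed = Time.now - start
puts format('elapsed: %.2fs', elapsed)
```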
JRuby on the other hand does support native threads, meaning you can use threads to split work between CPU cores.
But all is not lost. With MRI (and all the other ruby implementations), you can still use processes to split work.
This can be done using Process.fork, for example like this:
Process.fork do
  10.times do
    # Do some work in process 1
    sleep 1
    puts "Hello 1"
  end
end

Process.fork do
  10.times do
    # Do some work in process 2
    sleep 1
    puts "Hello 2"
  end
end

# Wait for both child processes to finish
# (a single Process.wait would only reap one of them)
Process.waitall
Using fork will split the processing between CPU cores, so if you can live without threads then separate processes are one way to do it.
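To get results back from forked workers you need some form of inter-process communication, since forked processes don't share memory. One common approach is a pipe per child; here's a minimal sketch (the partial-sum "work" is just a stand-in for your real computation):

```ruby
# Split an array across two forked workers; each sends its partial
# result back to the parent through a pipe.
data = (1..100).to_a
chunks = data.each_slice(50).to_a

pipes = chunks.map do |chunk|
  reader, writer = IO.pipe
  Process.fork do
    reader.close
    writer.puts chunk.sum  # the "work": a partial sum
    writer.close
  end
  writer.close  # parent keeps only the read end
  reader
end

Process.waitall
total = pipes.sum { |r| r.read.to_i }
puts total  # => 5050
```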

As nice as ruby is, it's not known for its speed of execution. That being said, if, as noted in your comment, you can break up the input into equal-sized chunks you should be able to start up n instances of the program, where n is the number of cores you have, and the OS will take care of using all the cores for you.
In the best case it would run in 1/n the time, but this kind of thing can be tricky to get exactly right as some portions of the system, like memory, need to be shared between the processes and contention between processes can cause things not to scale linearly. If the split is easy to do I'd give it a try. You can also just try running the same program twice and see how long it takes to run, if it takes the same amount of time to run one as it does to run two you're likely all set, just split your data and go for it.
Trying JRuby and some threads would probably help, but that adds a fair amount of complexity. (It would probably be a good excuse to learn about threading.)

Threading is usually considered one of Ruby's weak points, but it depends more on which implementation of Ruby you use.
A really good writeup on the different threading models is "Does ruby have real multithreading?".
From my experience, and from what I've gathered from people who know this stuff better, it seems that if you are going to choose a Ruby implementation, JRuby is the way to go. Though, if you are learning, you might want to choose another language such as Erlang, or maybe Clojure, which are popular choices if you want to use the JVM.

Related

Multithreaded programming for CPUs with E/P cores on Alder Lake architecture

This is more a conceptual/high level question rather than a language specific one.
In terms of writing software for high performance applications that will use multiple threads, would it be advisable to manually check for and only allocate threads to P cores, or rather just simply allocate threads as they were done before Alder Lake, and let the OS scheduler decide where to put them?
To be more specific, my program will be a computer game with separate computationally expensive CPU threads for AI, pathfinding, etc. Ideally I don't want these threads on E cores but I'm wondering if I should be leaving this sort of thing up to the OS to decide instead of ensuring it manually.
This is probably not a good question for SO, and might fit better on https://softwareengineering.stackexchange.com, but here goes with a conceptual/high-level answer anyways.
In terms of writing software for high performance applications you will get best performance by writing platform-dependent code, that is by writing programs which are informed by, and take advantage of, the particular features of the hardware+o/s+runtime on which the programs are to execute.
The costs of this approach are that the code will, by definition, perform less than optimally on any other platform; and that writing code to squeeze out every last drop of performance for a particular problem can be quite difficult and time-consuming.
Personally (so this might be an opinion, which is something SO doesn't like) I would first write the platform-neutral version of the code and test it. Only when I was convinced that I couldn't achieve necessary performance (or other) goals would I roll up my sleeves and develop that first version into a platform-dependent version. (Well, I might do this extra work for fun, but you catch my drift).
Later, if you want to move the program to another platform you already have the platform-neutral version to start with.

Can we time commands deterministically?

We know that in bash, time foo will tell us how long a command foo takes to execute. But there is so much variability, depending on unrelated factors including what else is running on the machine at the time. It seems like there should be some deterministic way of measuring how long a program takes to run. Number of processor cycles, perhaps? Number of pipeline stages?
Is there a way to do this, or if not, to at least get a more meaningful time measurement?
You've stumbled into a problem that's (much) harder than it appears. The performance of a program is absolutely connected to the current state of the machine in which it is running. This includes, but is not limited to:
The contents of all CPU caches.
The current contents of system memory, including any disk caching.
Any other processes running on the machine and the resources they're currently using.
The scheduling decisions the OS makes about where and when to run your program.
...the list goes on and on.
If you want a truly repeatable benchmark, you'll have to take explicit steps to control for all of the above. This means flushing caches, removing interference from other programs, and controlling how your job gets run. This isn't an easy task, by any means.
The good news is that, depending on what you're looking for, you might be able to get away with something less rigorous. If you run the job on your regular workload and it produces results in a good amount of time, then that might be all that you need.
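One less rigorous but practical improvement over wall-clock `time` is to measure CPU time, which ignores time your process spends descheduled, and to take the best of several runs to shave off noise. A sketch in Ruby (best_of is an illustrative helper, not a standard API):

```ruby
# CPU time excludes time spent waiting or preempted, so it is far
# less sensitive to whatever else the machine is doing.
def cpu_time
  Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
end

# Run the workload several times and keep the minimum: the fastest
# run is the one with the least interference.
def best_of(runs = 5)
  runs.times.map {
    before = cpu_time
    yield
    cpu_time - before
  }.min
end

t = best_of(5) { 100_000.times { |i| i * i } }
puts format('best CPU time: %.4fs', t)
```

For cycle-level counts you'd need an external tool such as `perf stat` on Linux, which reads the hardware performance counters directly.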

Would threading be beneficial for this situation?

I have a CSV file with over 1 million rows. I also have a database that contains such data in a formatted way.
I want to check and verify the data in the CSV file and the data in the database.
Is it beneficial/reduces time to thread reading from the CSV file and use a connection pool to the database?
How well does Ruby handle threading?
I am using MongoDB, also.
It's hard to say without knowing some more details about the specifics of what you want the app to feel like when someone initiates this comparison. So, to answer, some general advice that should apply fairly well regardless of the problem you might want to thread.
Threading does NOT make something computationally less costly
Threading doesn't make things less costly in terms of computation time. It just lets two things happen in parallel. So, beware that you're not falling into the common misconception that, "Threading makes my app faster because the user doesn't wait for things." - this isn't true, and threading actually adds quite a bit of complexity.
So, if you kick off this DB vs. CSV comparison task, threading isn't going to make that comparison take any less time. What it might do is allow you to tell the user, "Ok, I'm going to check that for you," right away, while doing the comparison in a separate thread of execution. You still have to figure out how to get back to the user when the comparison is done.
Think about WHY you want to thread, rather than simply approaching it as whether threading is a good solution for long tasks
Like I said above, threading doesn't make things faster. At best, it uses computing resources in a way that is either more efficient, or gives a better user experience, or both.
If the user of the app (maybe it's just you) doesn't mind waiting for the comparison to run, then don't add threading because you're just going to add complexity and it won't be any faster. If this comparison takes a long time and you'd rather "do it in the background" then threading might be an answer for you. Just be aware that if you do this you're then adding another concern, which is, how do you update the user when the background job is done?
Threading involves extra overhead and app complexity, which you will then have to manage within your app - tread lightly
There are other concerns as well, such as, how do I schedule that worker thread to make sure it doesn't hog the computing resources? Are the setting of thread priorities an option in my environment, and if so, how will adjusting them affect the use of computing resources?
Threading and the extra overhead involved will almost definitely make your comparison take LONGER (in terms of absolute time it takes to do the comparison). The real advantage is if you don't care about completion time (the time between when the comparison starts and when it is done) but instead the responsiveness of the app to the user, and/or the total throughput that can be achieved (e.g. the number of simultaneous comparisons you can be running, and as a result the total number of comparisons you can complete within a given time span).
Threading doesn't guarantee that your available CPU cores are used efficiently
See Green Threads vs. native threads - some languages (depending on their threading implementation) can schedule threads across CPUs.
Threading doesn't necessarily mean your threads wind up getting run in multiple physical CPU cores - in fact in many cases they definitely won't. If all your app's threads run on the same physical core, then they aren't truly running in parallel - they are just splitting CPU time in a way that may make them look like they are running in parallel.
For these reasons, depending on the structure of your app, it's often less complicated to send background tasks to a separate worker process (process, not thread), which can easily be scheduled onto available CPU cores at the OS level. Separate processes (as opposed to separate threads) also remove a lot of the scheduling concerns within your app, because you essentially offload the decision about how to schedule things onto the OS itself.
This last point is pretty important. OS schedulers are extremely likely to be smarter and more efficiently designed than whatever algorithm you might come up with in your app.
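The "respond now, work in the background" pattern described above is small to sketch. Here `compare` is a placeholder for the real CSV-vs-database check:

```ruby
# Run the comparison in a background thread so the app can respond
# immediately, and collect the result only when it is needed.
compare = -> { sleep 0.2; { mismatches: 0 } }  # placeholder work

worker = Thread.new { compare.call }
puts "Ok, I'm going to check that for you."  # respond right away

# ... the main thread is free to do other things here ...

result = worker.value  # blocks only at the point we need the answer
puts "Comparison finished: #{result[:mismatches]} mismatches"
```

Note this makes the app feel responsive; as discussed above, it does not make the comparison itself any faster.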

Find memory leak in very complex Ruby app

Hi everyone!
It's nice to work with Ruby and write some code. But over the past week I noticed that we have a problem in our application: memory usage is growing like an O(x*3) function.
Our application is very complex; it is based on EventMachine and other external libs. What's more, it is running under the amd64 version of FreeBSD using Ruby 1.8.7-p382.
I've tried to research on my own how to find the memory leak in our app.
I've found many tools and libs, but they don't work under 64-bit FreeBSD, and I have no idea how to go about finding leaks in a huge Ruby application. It's OK if you have a few files with 200-300 lines of code, but here we have around 30 files averaging 200-300 lines each.
I realize I'd need too much time to find those leaks by guessing: believe/research/assume that some part of the code may actually be leaking, wrap it in tracking code (e.g. using the ruby-prof gem), and repeat. That's a painfully slow approach because, as I said, we have too much code.
So, my question is: how do I find a memory leak in a very complex Ruby app without putting my whole life into this work?
Thanks in advance
One thing to try, even though it can massively degrade performance, is to manually trigger the garbage collector by calling GC.start every so often. How often is kind of subjective, as the more you run it the slower the app, and the less you run it the higher the memory footprint.
For whatever reason, the garbage collector may go on vacation from time to time, presumably not wanting to interfere if there is some heavy processing going on. As such you may have to manually call to have your trash taken away.
One way to avoid creating trash is to use memory more efficiently. Don't create hashes when arrays will do the job, don't create arrays when a single string will suffice, and so on. It will be important to profile your application to see what kind of objects are cluttering up your heap before you just start hacking away randomly.
If you can, try and use 1.9.2 which has made significant gains in terms of memory management. Ruby Enterprise Edition is also an option if you need 1.8.7 compatibility, as it's essentially a better garbage collector for that version.
How hard would it be to run your app on a linux box? If you don't have the same memory problems there, it is probably something specific with your ruby runtime. If you do have the same problems, you can use all the tools and libs that are linux only.
Another alternative - can you wrap your unit tests with some memory-tracking code? Most unit-test frameworks make it easy to add some code before/after each test. Or you could just run each test 1000000000 times and see if the memory goes out of control. If it does, you know something that happens in that test is causing the leak, and you can continue to isolate the problem.
Have you tried counting the number of objects you have, using ObjectSpace.each_object? Although you're intending to use small batches, maybe you have more objects than you think.
count = ObjectSpace.each_object {}
# => 7216
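To see not just how many objects there are but which classes dominate, you can group the count by class - a quick sketch of a heap histogram:

```ruby
# Group live objects by class to see what is cluttering the heap.
GC.start  # collect garbage first so only reachable objects are counted
counts = Hash.new(0)
ObjectSpace.each_object { |obj| counts[obj.class] += 1 }

# Print the five most common classes.
counts.sort_by { |_, n| -n }.first(5).each do |klass, n|
  puts format('%-20s %d', klass, n)
end
```

Running this at intervals and diffing the counts is a cheap way to spot which class of object keeps accumulating.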

Should I use multiple threads in this situation? [Ruby]

I'm opening multiple files and processing them, one line at a time. The files contain tokens separating the data, such that sometimes the processing of one file may have to wait for others to catch up to that same token.
I was doing this initially with only one thread and an array indicating with true/false if the file should be read in the current iteration or if it should wait for some of the others to catch up.
Would using threads make this simpler? More efficient? Does Ruby have a mechanism for this?
Firstly, threads never make anything simpler. Threading is only applicable for helping to speed up applications, and it introduces a host of new complications. It may seem handy to be able to describe multiple threads of execution, but it always makes life harder.
Secondly, premature optimization is the root of all evil. Do not attempt to speed up the file processing unless you know that it is a bottleneck. Do the simplest thing that could possibly work (but no simpler).
Thirdly, threading might help if the process of reading the files was independent so that thread can process a file without worrying about what the other threads are doing. It sounds like this is not true in your case. Since the different threads would have to communicate with each other you are unlikely to see a speed benefit in applying threads.
Fourthly, I don't know Ruby and therefore can't comment on what mechanisms it has.
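For the record, Ruby's stdlib does provide the relevant primitives: Mutex and ConditionVariable. The "wait at a token until the other files catch up" pattern is essentially a barrier; a minimal sketch (the Barrier class below is an illustration, not a stdlib class):

```ruby
# A minimal reusable barrier: each thread calls #wait when it reaches
# a token, and all threads are released together once everyone arrives.
class Barrier
  def initialize(count)
    @count = count
    @waiting = 0
    @generation = 0
    @mutex = Mutex.new
    @cond = ConditionVariable.new
  end

  def wait
    @mutex.synchronize do
      gen = @generation
      @waiting += 1
      if @waiting == @count
        @waiting = 0
        @generation += 1   # start a new round
        @cond.broadcast    # release everyone waiting at this token
      else
        @cond.wait(@mutex) while gen == @generation
      end
    end
  end
end

barrier = Barrier.new(3)
log = Queue.new
threads = 3.times.map do |i|
  Thread.new do
    log << [:before, i]
    barrier.wait           # pause at the token until all files catch up
    log << [:after, i]
  end
end
threads.each(&:join)
```

Whether this is worth the complexity over the single-threaded true/false array is exactly the trade-off the points above describe.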
I'm not sure if using threads in Ruby is beneficial. I recently wrote and tested an application that was supposed to do parallel computations, but I didn't get what I expected, even on a quad-core processor: it performed the computations sequentially, one thread after another. Read this article; it has a discussion of thread scheduling, and it may turn out that things haven't changed, at least for the original Ruby (MRI).