Is it possible to ignore irrelevant methods when profiling ruby applications? - ruby

While using ruby-prof, printed out in graph-html mode, the report for one method says (with some snipping)
%Total   %Self   Total   Self   Wait   Child   Calls   Name                    Line
52.85%   0.00%   51.22   0.00   0.00   51.22   1       ClassName#method_name   42
                 51.22   0.00   0.00   51.22   1/3     Hash#each               4200
Obviously, it's not Hash#each that's taking a long time, but the yield block within Hash#each.
Looking at the report for Hash#each is confusing because it reports on all of the code called by anything that uses Hash#each.
Is it possible to ask ruby-prof to put the information on yielded code in ClassName#method_name's report?
Using min_percent or switching to a flat profile doesn't seem to help.

Version 0.9.0 of ruby-prof allows method elimination. For example, to eliminate Integer#times, use
result = RubyProf.stop
result.eliminate_methods!([/Integer#times/])
so that
def method_a
  5.times { method_b }
end
will indicate the relationship between method_a and method_b directly.

If you don't mind low-tech, maybe you want to consider this. All you need is to be able to pause the debugger. Guaranteed, it will quickly find anything you can find any other way, and not show you any irrelevant code.
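A rough Ruby approximation of the same idea, for cases where attaching a debugger is inconvenient: a background thread periodically grabs the backtrace of the thread doing the work, and the frames that show up in most samples are where the time goes. This is only a sketch standing in for manual pausing, not part of ruby-prof; sample_stacks, slow_part and work are hypothetical names.

```ruby
# Collect stack samples of the calling thread while the block runs.
# sample_stacks, slow_part and work are hypothetical names for illustration.
def sample_stacks(interval: 0.01)
  target = Thread.current
  samples = []
  running = true
  sampler = Thread.new do
    while running
      sleep interval
      stack = target.backtrace      # nil if the thread has already finished
      samples << stack if stack
    end
  end
  yield
  samples
ensure
  running = false
  sampler&.join
end

def slow_part
  sleep 0.2                         # stands in for the expensive code
end

def work
  3.times { slow_part }
end

samples = sample_stacks { work }

# Count in how many samples each frame appears and show the most frequent ones.
counts = Hash.new(0)
samples.each { |stack| stack.uniq.each { |frame| counts[frame] += 1 } }
counts.sort_by { |_, n| -n }.first(5).each { |frame, n| puts "#{n}  #{frame}" }
```

Because every sample is a complete stack, the block inside Integer#times or Hash#each appears together with the method that called it, so iterator methods do not hide the relationship.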

Related

How can 'behave' be used to test time constraints?

I couldn't work out if Stack Overflow is an appropriate site to ask this and if not then what the appropriate site would be! There are so many stack exchange sites now and I didn't want to go through 200 sites :S
When I try to test whether my functions run within X seconds using behave (i.e. gherkin feature files and behave test steps), the code takes longer to run under behave than it does on its own. This is especially noticeable at the beginning of the test, but it also happens in other parts.
Has anybody tested time constraints with behave before and know a workaround to adjust for the extra time that behave adds?
Is this even possible?
EDIT: To show how I'm timing the tests:
import time
from behave import when, then
@when("the Python script provides images to the model")
def step_impl(context):
    context.x_second_requirement = 5
    # TODO: Investigate why this takes so long; without behave a 0.8 second timing constraint works
    context.start_time = time.time()
    context.car_brain.tick()
    context.end_time = time.time()
@then("the model must not take more than X seconds to produce output")
def step_impl(context):
    assert context.end_time - context.start_time < context.x_second_requirement
Cheers,
Milan

Find the Run Time of Select Ruby Code

Problem
Howdy guys, so I want to find the run time of a block of code in Ruby, but I am not entirely sure as to how I could do it. I want to run some code, and then output how long it took to run that code because I have a super huge program and the run time changes a lot. I want to make sure it always has a consistent run time (I could do it by sleeping it for a fraction of a second) but that isn't my problem. I want to find out how long the run time actually is so the program can know if it needs to slow things down or speed things up.
My Thoughts
So, I have an idea as to how it could work. I have never used Time in ruby but I have an idea as to how I could use that. I could have a variable equal to the time (in milliseconds) and then another variable that I make at the end of the code block that does it again, and then I just subtract them, but I have (1) never used Time and (2) I don't actually know if that is the best way.
Thanks in advance!
Ruby has the Benchmark module for timing how long things take. I've never used this outside of seeing if a method is taking too long to run, etc. in development, not sure if this is 'recommended' for production code or for keeping things above a minimum runtime (as it sounds like you might be doing), but take a look and see how it feels for your use case.
It also sounds like you might be interested in the Timeout module as well (for making sure things don't take longer than a set amount of time).
If you really have a use case for making sure something takes a minimum amount of time, timing the code (either using a Benchmark method or just Time or another solution) and then sleep the difference is the only thing that comes to mind.
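A minimal sketch of "time the code, then sleep the difference", assuming Benchmark.realtime is acceptable for the measurement. The helper name with_minimum_duration is made up for illustration.

```ruby
require 'benchmark'

# Run the block, then sleep for whatever is left of min_seconds.
# with_minimum_duration is a hypothetical helper name.
def with_minimum_duration(min_seconds)
  result = nil
  elapsed = Benchmark.realtime { result = yield }   # wall-clock seconds as a Float
  remaining = min_seconds - elapsed
  sleep(remaining) if remaining > 0
  result
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
value = with_minimum_duration(0.1) { 21 * 2 }
total = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts format("result=%d, total=%.3fs", value, total)
```

If the block already takes longer than the minimum, the helper does not sleep at all, so the run time is only bounded from below.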
It is simple. Look at your watch (Time.now) and remember the time, run the code, look at your watch again, subtract.
t0 = Time.now
# your block of code
puts Time.now - t0
http://ruby-doc.org/core-1.9.3/Time.html
You want to use the Time object. (Time Docs)
For example,
start = Time.now
# code to time
finish = Time.now
diff = finish - start
diff would be in seconds, as a floating point number.
EDIT: end is a reserved word in Ruby, which is why the second variable is called finish.
or you can use
require 'benchmark'
def foo
  time = Benchmark.measure {
    # code to test
  }
  puts time # or save it to logs; time.real is the elapsed wall-clock time alone
end
Sample output:
2.2.3 :001 > foo
5.230000 0.020000 5.250000 ( 5.274806)
Values are user CPU time, system CPU time, total CPU time and real elapsed time, all in seconds.
http://ruby-doc.org/stdlib-2.0.0/libdoc/benchmark/rdoc/Benchmark.html#method-c-bm
Source: Ruby docs.

Multi-threading in Ruby (MRI)

According to the GIL implementation in Ruby (MRI), the code below should fail by printing the message more than once. But it doesn't; it always prints it exactly once:
class Sheep
  def initialize
    @shorn = false
  end
  def shorn?
    @shorn
  end
  def shorn!
    puts "shearing..."
    @shorn = true
  end
end
s = Sheep.new
55.times.map do
  Thread.new { s.shorn! unless s.shorn? }
end.each(&:join)
How come?
$ ruby --version
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
It depends a bit on which exact Ruby version you use, since versions differ in the way they schedule threads. On my system the result also depends on the overall system load and how fast the terminal is, but on Ruby 2.0.0p481 I get between 1 and 55 lines of output, while on Ruby 1.8.7 I consistently get only one line.
It should be noted here that Ruby 1.9 and higher use actual OS threads (albeit still with a GIL), while Ruby 1.8 uses internal green threads with its own scheduling. It may well be that older Ruby versions schedule threads more granularly.
In any case, you should not rely on any incidental thread scheduling behavior. It is not part of any documented behavior, and things will change on different systems and as Ruby matures. You should always ensure that you use shared data structures safely when using threads.
I use Ruby version ruby 2.1.5p273 and I suppose your slightly different Ruby version should yield similar results.
I have different results every time I run the program.
I tried with one core enabled and with four cores enabled. I don't see a difference. It is not thread safe, as you expected.
Otherwise the only answer I can come up with is that your program is too fast/lightweight, so that the interpreter does not think of thread switching too often.
I have only one suggestion in this case: a trick that hints to the interpreter that it may switch threads. You could use the sleep function.
In your example I would put it just before the race condition:
def shorn!
  sleep 0.0001
  puts "shearing..."
  @shorn = true
end
If you'd like to have more info about the GIL I can recommend Jesse Storimer's Nobody understands the GIL
If you'd like to read more about Ruby and concurrency I can recommend Dotan Nahum's Pragmatic Concurrency with Ruby
The trick I suggested was mentioned in this answer
As others have mentioned, the GIL's behavior is not documented and is totally implementation-dependent. You shouldn't rely on any expectations about its scheduling behavior.
A more detailed (and also more general) answer, however, is that the scheduler switches execution between threads to make sure that no single thread blocks the process. This switch is called a context switch or more specifically a thread switch.
When the context switch occurs, the current thread's execution is paused and another thread's execution is resumed. If it's a brand new thread that's being "resumed," then it means that the new thread's execution starts from the beginning.
In the case of your program, each new thread begins with
s.shorn?
as it evaluates unless s.shorn?. At this point, @shorn == false and s.shorn? evaluates to false. So then the thread runs:
s.shorn!
The first command in #shorn! that gets run is:
puts "shearing..."
What happens next depends on the thread scheduler:
If the scheduler decides to let the current thread continue executing, then the next command that gets executed is @shorn = true. Then the thread ends, the scheduler starts the next thread, unless s.shorn? evaluates to true, and the thread stops. This behavior repeats in a loop until there are no more threads left.
If the scheduler decides to switch to another thread, then it will pause execution right before @shorn = true and start running the same code as before from the beginning. That means that @shorn == false when the new thread starts, and so puts "shearing..." will execute again.
As you can see, it all depends on when the scheduler decides to perform a context switch.
But what about the GIL?
The GIL is a horribly misunderstood part of MRI Ruby. There are plenty of resources out there to explain how the GIL works, but in this case the most important thing that you should know is that the GIL doesn't guarantee that each thread will run sequentially.
Instead, the GIL merely guarantees that most core Ruby methods that are implemented in C (for example, Array#<<) won't be interrupted by a context switch until they are finished. In the case of puts "shearing...", I haven't looked at the code for puts, but probably the GIL guarantees that no other thread will run until the currently running thread finishes executing puts.
As for why your code displayed shearing... only once when run under MRI 1.8.7, that doesn't necessarily have anything to do with green vs. native threads. The better answer is that it was a coincidence. The more precise answer is that in your case, for some reason, the scheduler decided to interrupt the first thread only after it had run @shorn = true. This behavior may possibly have been due to green threads, in the sense that maybe your native scheduler interrupts more frequently than Ruby's scheduler (hence the "more granular" suggestion in one of the answers below), but that's not necessarily true. It could also have been a fluke.
Multithreading in Ruby is really easy to mess up. Hence why Matz recommends sticking to forking processes, which is memory-inefficient but removes the burden of managing threads. Another approach for larger projects would be to use a library like Celluloid, which abstracts away Ruby's thread safety mechanisms. For a small example like this, however, a simple mutex would do:
semaphore = Mutex.new
s = Sheep.new
55.times.map {
  Thread.new {
    semaphore.synchronize do
      s.shorn! unless s.shorn?
    end
  }
}.each(&:join)
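To see both outcomes side by side, here is a sketch that widens the race window with a sleep (the trick suggested in another answer) and counts the shearings with and without a mutex. SlowSheep and shear_concurrently are hypothetical names, and the count without the mutex depends on scheduling, so only the mutex result is guaranteed.

```ruby
# Sheep variant that records every shearing in a thread-safe Queue and
# sleeps inside shorn! to widen the gap between the check and the write.
# SlowSheep and shear_concurrently are hypothetical names for illustration.
class SlowSheep
  attr_reader :shearings

  def initialize
    @shorn = false
    @shearings = Queue.new
  end

  def shorn?
    @shorn
  end

  def shorn!
    sleep 0.05               # other threads get time to pass the shorn? check
    @shearings << :sheared
    @shorn = true
  end
end

# Start 20 threads that each try to shear the sheep, optionally under a lock.
def shear_concurrently(sheep, lock: nil)
  20.times.map do
    Thread.new do
      if lock
        lock.synchronize { sheep.shorn! unless sheep.shorn? }
      else
        sheep.shorn! unless sheep.shorn?
      end
    end
  end.each(&:join)
  sheep.shearings.size
end

unsafe = shear_concurrently(SlowSheep.new)
safe   = shear_concurrently(SlowSheep.new, lock: Mutex.new)
puts "without mutex: #{unsafe} shearings, with mutex: #{safe}"
```

With the mutex, the check and the write happen as one unit, so only the first thread to take the lock ever shears the sheep.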

how to produce delay in ruby

How to produce delay in ruby?
I used sleep statement but it didn't give me what I want.
puts "amit"
sleep(10)
puts "scj"
I want it to first print amit, then a delay of 10 seconds, then print scj.
But in above case what happens is it will pause for 10 seconds and then it will print amit and scj together. I don't want that.
I hope you got what I want to say.
I can't reproduce this. From a console, this does exactly what you'd expect:
puts "amit"
sleep 10
puts "scj"
(Ruby 1.8.6 on Linux)
Can you provide a similar short but complete example which doesn't do what you want - or explain your context more?
If you're writing a web application, then the browser may well only see any data once the whole response has been written; that would explain what you're seeing. If that's the case, you'll need a different approach which allows the initial response to be written first and then makes the browser send another request. The delay could be at the server or the client, depending on the scenario.
Call $stdout.flush before the call to sleep. The output is probably buffered (although usually output is only line-buffered so puts, which produces a newline, should work without flushing, but apparently that's not true for your terminal).
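A small sketch of the flush-before-sleep approach. The method name print_with_delay and its out and delay parameters are additions for illustration, so the behavior can be tried with a short delay or a different stream.

```ruby
# Print, flush, wait, print again.
# print_with_delay, out and delay are hypothetical names for illustration.
def print_with_delay(first, second, delay, out: $stdout)
  out.puts first
  out.flush          # push the first line out before sleeping
  sleep delay
  out.puts second
end

print_with_delay("amit", "scj", 0.1)
```

Setting $stdout.sync = true once at the start of the program is an alternative: it turns off Ruby's own output buffering, so no explicit flush calls are needed.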

Is there a simple method for checking whether a Ruby IO instance will block on read()?

I'm looking for a method in Ruby which is basically this:
io.ready_for_read?
I just want to check whether a given IO object (in my case, the result of a popen call) has output available, i.e. a follow up call io.read(1) will not block.
These are the two options I see, neither of which I like:
io.read_nonblock - too thin an abstraction of Unix read() -- I don't want to deal with errno error handling.
IO.select with timeout 0 -- obfuscates the purpose of this simple operation.
Is there a better alternative that I have overlooked?
A bit late, but if you require 'io/wait', you can use ready? to verify that the IO can be read without blocking. Granted, depending upon how much you intend on reading (and how you plan to do it) your IO object may still block, but this should help. I'm not sure if this library is supported on all platforms, and I also don't know why this functionality was separated from the rest of the IO library. See more here: http://ruby-doc.org/stdlib/libdoc/io/wait/rdoc/
I'm ready to conclude that no, there is no simple method to do this. Per Peter Cooper's suggestion, here is IO#ready_for_read?:
class IO
  def ready_for_read?
    result = IO.select([self], nil, nil, 0)
    result && (result.first.first == self)
  end
end
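A usage sketch of the same IO.select check, with IO.pipe standing in for the popen stream from the question. It is written as a standalone helper (readable_now? is a hypothetical name) so the example does not reopen the IO class.

```ruby
# Poll an IO object for readability without waiting.
# readable_now? is a hypothetical helper name for illustration.
def readable_now?(io)
  ready, = IO.select([io], nil, nil, 0)   # timeout 0: poll, never wait
  !ready.nil? && ready.include?(io)
end

reader, writer = IO.pipe
before = readable_now?(reader)   # nothing has been written yet
writer.write("x")
writer.flush
after = readable_now?(reader)    # one byte is waiting in the pipe
puts "before write: #{before}, after write: #{after}"
puts reader.read(1)
```

Note that select also reports a stream as readable once the other end has been closed; in that case read returns nil (or an empty string) instead of blocking.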
On Windows I've seen some inconsistencies with io/wait. The Ruby I have here right now is:
ruby 1.9.2p136 (2010-12-25) [i386-mingw32]
On this one both nread and ready? are implemented, but they return erroneous results. On a different version that I was using ready? was still broken and nread didn't even exist.
One possibility is to use io.stat.size, which tells you the number of bytes available to read in an IO stream.
http://www.ruby-doc.org/core/classes/File/Stat.html
The documentation suggests that it's for files, but I've used it on pipes connected to a separate process (via Ruby's Open3.popen3). It's worked for me so far.
