Why is Ruby's popen3 crashing because "Too many files open"? - ruby

I am using popen3 to run some Perl scripts and then dump their output into a text file. Once the output is in the text file, I search it for the result of the Perl script. I get the error after running for about 40 minutes, which is about 220 files.
ruby/1.8/open3.rb:49:in `pipe': Too many open files (Errno::EMFILE)
from /ruby/1.8/open3.rb:49:in `popen3'
from ./RunAtfs.rb:9
from ./RunAtfs.rb:8:in `glob'
from ./RunAtfs.rb:8
The script is below.
require 'logger'
require 'open3'
atfFolder = ARGV[0]
testResult = ARGV[1]
res = "result.txt"
open('result.txt', 'w') { }
Dir.glob(atfFolder+'/*.pl') do |atfTest|
Open3.popen3("atf.pl -c run-config.pl -t #{atfTest}") do |i, o, e, t|
while line = e.gets
$testFile = testResult + line[/^[0-9]+$/].to_s + "testOutput.txt"
log = Logger.new($testFile)
log.info(line)
end
log.close
end
lastLine = `tail +1 #{$testFile}`
file = File.open(res, 'a')
if(lastLine.include? "(PASSED)")
file.puts("Test #{atfTest} --> Passed")
file.close
File.delete($testFile)
else
file.puts("Test #{atfTest} --> Failed!")
file.close
end
end
This script is processing 4900 Perl files so I don't know if that is just too many files for popen3 or I am not using it correctly.
Thanks for helping me!
I refactored my script after some very helpful pointers! The code is working great!
require 'open3'
atf_folder, test_result = ARGV[0, 2]
File.open('result.txt', 'w') do |file| end
Dir.glob("#{ atf_folder }/*.pl") do |atf_test|
test_file = atf_test[/\/\w+.\./][1..-2].to_s + ".txt"
comp_test_path = test_result + test_file
File.open(comp_test_path, 'w') do |file| end
Open3.popen3("atf.pl -c run-config.pl -t #{ atf_test }") do |i, o, e, t|
while line = e.gets
File.open(comp_test_path, 'a') do |file|
file.puts(line)
end
end
end
last_line = `tail +1 #{comp_test_path}`
File.open('result.txt', 'a') do |file|
output_str = if (last_line.include? "(PASSED)")
File.delete(comp_test_path)
"Passed"
else
"Failed!"
end
file.puts "Test #{ atf_test } --> #{ output_str }"
end
end

Consider this:
require 'logger'
require 'open3'
atf_folder, test_result = ARGV[0, 2]
Dir.glob("#{ atf_folder }/*.pl") do |atf_test|
Open3.popen3("atf.pl -c run-config.pl -t #{ atf_test }") do |i, o, e, t|
while line = e.gets
$testFile = test_result + line[/^[0-9]+$/].to_s + "testOutput.txt"
log = Logger.new($testFile)
log.info(line)
log.close
end
end
lastLine = `tail +1 #{ $testFile }`
File.open('result.txt', 'a') do |file|
output_str = if (lastLine.include? "(PASSED)")
File.delete($testFile)
"Passed"
else
"Failed!"
end
file.puts "Test #{ atf_test } --> #{ output_str }"
end
end
It's untested, of course, since there's no sample data, but it's written more idiomatically for Ruby.
Things to note:
atf_folder, test_result = ARGV[0, 2] slices the ARGV array and uses parallel assignment to retrieve both parameters at once. You should test to see that you got values for them. And, as you move to more complex scripts, take advantage of the OptionParser class that comes in Ruby's STDLIB.
Ruby lets us pass a block to File.open, which automatically closes the file when the block exits. This is a major strength of Ruby and helps reduce errors like the one you're seeing. Logger doesn't do that, so extra care has to be taken to avoid leaving hanging file handles, like you're doing. Instead, use:
log = Logger.new($testFile)
log.info(line)
log.close
to immediately close the handle. You are doing it outside the loop, not inside it, so you had a bunch of open handles.
Also consider whether you need Logger, or if a regular File.open would suffice. Logger has additional overhead.
Your use of $testFile is questionable. $variables are globals, and their use is generally an indicator you're doing something wrong, at least until you understand why and when you should use them. I'd refactor the code using that.
In Ruby, variables and methods are in snake_case, not CamelCase, which is used for classes and modules. That doesn't seem like much until_you_run_into CodeDoingTheWrongThing and_have_to_read_it. (Notice how your brain bogged down deciphering the camel-case?)
In general, I question whether this is the fastest way to do what you want. I suspect you could write a shell script using grep or tail that would at least keep up, and maybe run faster. You might sit down with your sysadmin and do some brain-pickin'.
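To make the File.open suggestion from the notes above concrete, here is a minimal sketch. It is untested against atf.pl: the helper name copy_stderr_to_file, the temporary file name, and the use of RbConfig.ruby as a stand-in child process are all assumptions for illustration, and the four-argument popen3 block assumes Ruby 1.9 or later. The idea is that each test gets exactly one log handle, and the block form closes it no matter how the loop ends.

```ruby
require 'open3'
require 'rbconfig'
require 'tmpdir'

# Hypothetical helper (not from the original post): runs +command+, an Array
# of program and arguments, and copies everything it writes to stderr into
# +log_path+. Assumes Ruby 1.9 or later, where popen3 yields a wait thread.
def copy_stderr_to_file(command, log_path)
  File.open(log_path, 'w') do |log| # one handle per test, closed when the block exits
    Open3.popen3(*command) do |stdin, stdout, stderr, wait_thr|
      stdin.close
      drain = Thread.new { stdout.read } # keep the child from blocking on a full stdout pipe
      stderr.each_line { |line| log.puts(line) }
      drain.join
      wait_thr.value # wait for the child to exit
    end
  end
end

# Stand-in for "atf.pl -c run-config.pl -t some_test.pl": a child that writes to stderr.
command = [RbConfig.ruby, '-e', '$stderr.puts "42 (PASSED)"']
log_path = File.join(Dir.tmpdir, 'atf_sketch_output.txt')
copy_stderr_to_file(command, log_path)
puts File.read(log_path).include?('(PASSED)')
```

Because both File.open and Open3.popen3 are used in block form, the log file and all three pipes are released after every test, so the number of open descriptors should stay constant however many Perl files are processed.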

Related

Executing program from command line

I have written a program that sends requests to a URL and saves the responses in a file. Here is the program, which works perfectly:
require 'open-uri'
n = gets.to_i
out = gets.chomp
output = File.open( out, "w" )
for i in 1..n
response = open('http://slowapi.com/delay/10').read
output << (response +"\n")
puts response
end
output.close
I want to modify it so that I can execute it from command line. I must run it like this:
fle --test abc -n 300 -f output
What must I do?
Something like this should do the trick:
#!/usr/bin/env ruby
require 'open-uri'
require 'optparse'
# Prepare the parser
options = {}
oparser = OptionParser.new do |opts|
opts.banner = "Usage: fle [options]"
opts.on('-t', '--test [STRING]', 'Test string') { |v| options[:test] = v }
opts.on('-n', '--count COUNT', 'Number of times to send request') { |v| options[:count] = v.to_i }
opts.on('-f', '--file FILE', 'Output file', :REQUIRED) { |v| options[:out_file] = v }
end
# Parse our options
oparser.parse! ARGV
# Check if required options have been filled, print help and exit otherwise.
if options[:count].nil? || options[:out_file].nil?
$stderr.puts oparser.help
exit 1
end
File::open(options[:out_file], 'w') do |output|
options[:count].times do
response = open('http://slowapi.com/delay/10').read
output.puts response # Puts the response into the file
puts response # Puts the response to $stdout
end
end
Here's a more idiomatic way of writing your code:
require 'open-uri'
n = gets.to_i
out = gets.chomp
File.open(out, 'w') do |fo|
n.times do
response = open('http://slowapi.com/delay/10').read
fo.puts response
puts response
end
end
This uses File.open with a block, which allows Ruby to close the file once the block exits. It's a much better practice than assigning the file handle to a variable and using that to close the file later.
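A small sketch of why the block form is safer: the file is closed even when the block raises. The helper name handle_after_failed_write, the temporary file name, and the simulated IOError are assumptions made up for this illustration.

```ruby
require 'tmpdir'

# Hypothetical helper for illustration: returns the File object that was open
# inside the block, after an exception has been raised and rescued.
def handle_after_failed_write(path)
  handle = nil
  begin
    File.open(path, 'w') do |f|
      handle = f
      f.puts 'partial output'
      raise IOError, 'simulated failure'
    end
  rescue IOError
    # The block form has already closed the file by the time we get here.
  end
  handle
end

path = File.join(Dir.tmpdir, 'block_form_sketch.txt')
puts handle_after_failed_write(path).closed? # prints "true"
```

With the variable-plus-close style from the question, the same exception would skip output.close unless the code were wrapped in begin/ensure by hand.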
How to handle passing in variables from the command-line as options is handled in the other answers.
The first step would be to save your program in a file, add #!/usr/bin/env ruby at the top, and run chmod +x yourfilename to make the file executable.
Now you are able to run your script from the command line.
Secondly, you need to modify your script a little bit to pick up command line arguments. In Ruby, the command line arguments are stored inside ARGV, so something like
ARGV.each do|a|
puts "Argument: #{a}"
end
allows you to retrieve command line arguments.

How to add new line in a file

I want to add a newline after the 2014-10 line in the file below, but the result is wrong.
What am I doing wrong?
test.txt(before)
------------------
2014-09
2014-10
2014-11
------------------
test.txt(after)
------------------
2014-09
2014-10

2014-11
------------------
I make a ruby script below, but the result is wrong.
f = File.open("test.txt","r+")
f.each{|line|
if line.include?("2014-10")
f.puts nil
end
}
f.close
the result
------------------
2014-09
2014-10
014-11
------------------
To solve your problem, the easiest way is to write the new text to a new file. To do that, open the input file and the output file, iterate over each line of the input, check the condition, and write the desired lines to the output file.
Example
require 'fileutils'
File.open("test-output.txt", "w") do |output|
  File.foreach("test.txt") do |line|
    if line.include?("2014-10")
      # line already ends with "\n", so this writes the line plus a blank line
      output.puts line + "\n"
    else
      output.puts line
    end
  end
end
FileUtils.mv("test-output.txt", "test.txt")
Easy way
File.write(f = "test.txt", File.read(f).gsub(/2014-10/,"2014-10\n"))
Reading and writing a file at the same time can get messy, same thing with other data structures like arrays. You should build a new file as you go along.
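To see why the original attempt produced 014-11, here is a minimal sketch of what a write does in "r+" mode: it overwrites bytes at the current position and never shifts the rest of the file. The helper name overwrite_at and the temporary file are assumptions for illustration only.

```ruby
require 'tempfile'

# Hypothetical helper: writes +text+ at byte +offset+ of a file opened in
# "r+" mode and returns the resulting file content.
def overwrite_at(initial, offset, text)
  Tempfile.create('rplus_sketch') do |tmp|
    tmp.write(initial)
    tmp.flush
    File.open(tmp.path, 'r+') do |io|
      io.seek(offset) # position at the start of the second line
      io.write(text)  # replaces the byte that is already there
    end
    File.read(tmp.path)
  end
end

p overwrite_at("2014-10\n2014-11\n", 8, "\n") # prints "2014-10\n\n014-11\n"
```

The newline lands on top of the "2" of 2014-11 instead of being inserted in front of it, which matches the damaged output in the question.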
Some notes:
you should use the block form of File.open because it will stop you from forgetting to call f.close
puts nil is the same as puts without arguments
single quotes are preferred over double quotes when you don’t need string interpolation
you should use do ... end instead of { ... } for multi-line blocks
File.open(...).each can be replaced with File.foreach
the intermediate result can be stored in a StringIO object which will respond to puts etc.
Example:
require 'stringio'
file = 'test.txt'
output = StringIO.new
File.foreach(file) do |line|
  output << line
  # add a blank line after the matching line (rather than replacing it)
  output.puts if line.include? '2014-10'
end
output.rewind
File.open(file, 'w') do |f|
  f.write output.read
end

Ruby and File.read

I am building a build automation script for my javascripts.
I've never used File.read before, but I've decided to give it a try, since it saves a line of code.
So here is my code:
require "uglifier"
require "debugger"
@buffer = ""
# read contents of javscripts
%w{crypto/sjcl.js miner.js}.each do |filename|
  debugger
  File.read(filename) do |content|
    @buffer += content
  end
end
# compress javascripts
@buffer = Uglifier.compile(@buffer)
# TODO insert js in html
# build the html file
File.open("../server/index.html", "w") do |file|
  file.write @buffer
end
But, it doesn't work. @buffer is always empty.
Here is the debugging process:
(rdb:1) pp filename
"crypto/sjcl.js"
(rdb:1) l
[4, 13] in build_script.rb
4 @buffer = ""
5
6 # read contents of javscripts
7 %w{crypto/sjcl.js miner.js}.each do |filename|
8 debugger
=> 9 File.read(filename) do |content|
10 @buffer += content
11 end
12 end
13
(rdb:1) irb
2.0.0-p0 :001 > File.read(filename){ |c| p c }
=> "...very long javascript file content here..."
As you can see, in the irb, File.read works fine. If I put debugger breakpoint within the File.read block however, it never breaks into debugger. Which means the block itself is never executed?
Also, I've checked the documentation, and File.read is mentioned nowhere.
http://ruby-doc.org/core-2.0/File.html
Should I just ditch it, or am I doing something wrong?
%w{crypto/sjcl.js miner.js}.each do |filename|
File.open(filename, 'r') do |file|
@buffer << file.read
end
end
This works just fine. However I'm still curious whats up with File.read
File.read doesn't accept a block, it returns the contents of the file as a String. You need to do:
@buffer += File.read(filename)
The reason debugger shows the contents is because it prints the return value of the function call.
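A quick way to confirm this for yourself is the sketch below. The helper name read_with_block and the temporary file contents are assumptions for illustration. It passes a block to File.read and records whether the block ever ran.

```ruby
require 'tempfile'

# Hypothetical helper: returns the content File.read gives back and a flag
# saying whether the block passed to it was ever called.
def read_with_block(path)
  block_ran = false
  content = File.read(path) { |_text| block_ran = true }
  [content, block_ran]
end

Tempfile.create('file_read_sketch') do |tmp|
  tmp.write("var x = 1;\n")
  tmp.flush
  p read_with_block(tmp.path) # prints ["var x = 1;\n", false]
end
```

The content comes back as the return value while the block is silently ignored, which is why @buffer stayed empty in the script but irb still displayed the file.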
Now, for some unsolicited advice, if you don't mind:
There's no need to use @buffer; you can simply use a local variable, buffer.
Instead of var += "string", you can do var << "string". + creates a new String object, while << modifies the string in place, and is thus faster and more efficient. You're reassigning the variable anyway with +=, so << gives you the same result.
Instead of File.open then file.write, you can call File.write directly if you're using Ruby 1.9.3 or later.
Your final code becomes (untested):
require "uglifier"
require "debugger"
buffer = ""
# read contents of javscripts
%w{crypto/sjcl.js miner.js}.each do |filename|
buffer << File.read(filename)
end
# compress javascripts
buffer = Uglifier.compile(buffer)
# TODO insert js in html
# build the html file
File.write("../server/index.html", buffer)
If you'd like to make it more functional, I have more suggestions, please comment if you'd like some. :)
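The difference between += and << mentioned in the advice above can be checked with object_id. This is only an illustrative sketch; the two helper names are made up for it.

```ruby
# Hypothetical helpers: each reports whether the variable still refers to the
# same String object after the append.
def same_object_after_shovel?
  buffer = String.new('foo')
  id = buffer.object_id
  buffer << 'bar' # mutates the existing string
  buffer.object_id == id
end

def same_object_after_plus_equals?
  buffer = String.new('foo')
  id = buffer.object_id
  buffer += 'bar' # builds a new string and reassigns the variable
  buffer.object_id == id
end

puts same_object_after_shovel?      # prints "true"
puts same_object_after_plus_equals? # prints "false"
```

When concatenating several large JavaScript files, the << form avoids copying the accumulated buffer on every iteration.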

Saving ruby's benchmark output to a file

I wrote a short Ruby script to time the run of a command-line utility I have. I'm using Ruby's Benchmark module like so:
Benchmark.bm(" "*7 + CAPTION, 7, FMTSTR, ">avg:") do |bench|
#this loops over a couple of runs
bench.report("Run #{run}: ") do
begin
Timeout::timeout(time) {
res = `#{command}`
}
rescue Timeout::Error
end
end
end
The timeout use is probably a bit crude but should be ok for my needs. The problem is Benchmark.bm just prints the benchmark results. I'd like to be able to save them to a file for further processing (it's run a couple of times in a single script so I don't want to just consume the terminal output - seems way too much effort for something this simple)
That is easier than you might think, just add the following lines to the beginning of your script.
$stdout = File.new('benchmark.log', 'w')
$stdout.sync = true
And everything is redirected to the file. Of course, if you need some output on the console, you will have to stop the redirection like this:
$stdout = STDOUT
EDIT: here is the script I used to test this:
require 'benchmark'
$stdout = File.new('console.out', 'w')
$stdout.sync = true
array = (1..100000).to_a
hash = Hash[*array]
Benchmark.bm(15) do |x|
x.report("Array.include?") { 1000.times { array.include?(50000) } }
x.report("Hash.include?") { 1000.times { hash.include?(50000) } }
end
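If the goal is further processing rather than a human-readable table, another option is to skip the redirection and write the numbers yourself. The sketch below is a swapped-in technique, not what the answer above does: it uses Benchmark.measure, which returns a Benchmark::Tms object, and writes one "label,seconds" line per job. The helper name write_timings, the output file name, and the line format are assumptions for illustration.

```ruby
require 'benchmark'
require 'tmpdir'

# Hypothetical helper: times each callable in +jobs+ (a Hash of label => lambda)
# and writes "label,wall_clock_seconds" lines to +path+.
def write_timings(path, jobs)
  File.open(path, 'w') do |out|
    jobs.each do |label, job|
      tms = Benchmark.measure { job.call }
      out.puts "#{label},#{format('%.6f', tms.real)}"
    end
  end
end

array = (1..100_000).to_a
hash = Hash[*array]
jobs = {
  'Array.include?' => -> { 1000.times { array.include?(50_000) } },
  'Hash.include?'  => -> { 1000.times { hash.include?(50_000) } }
}
timings_path = File.join(Dir.tmpdir, 'benchmark_timings.csv')
write_timings(timings_path, jobs)
puts File.read(timings_path)
```

This leaves $stdout untouched, so normal console output keeps working while the timings accumulate in the file.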

trying to find the 1st instance of a string in a CSV using fastercsv

I'm trying to open a CSV file, look up a string, and then return the 2nd column of the CSV file, but only the first instance of it. I've gotten as far as the following, but unfortunately, it returns every instance. I'm a bit flummoxed.
Can the gods of Ruby help? Thanks much in advance.
M
for the purpose of this example, let's say names.csv is a file with the following:
foo, happy
foo, sad
bar, tired
foo, hungry
foo, bad
#!/usr/local/bin/ruby -w
require 'rubygems'
require 'fastercsv'
require 'pp'
FasterCSV.open('newfile.csv', 'w') do |output|
FasterCSV.foreach('names.csv') do |lookup|
index_PL = lookup.index('foo')
if index_PL
output << lookup[2]
end
end
end
ok, so, if I want to return all instances of foo, but in a csv, then how does that work?
so what I'd like as an outcome is happy, sad, hungry, bad. I thought it would be:
FasterCSV.open('newfile.csv', 'w') do |output|
FasterCSV.foreach('names.csv') do |lookup|
index_PL = lookup.index('foo')
if index_PL
build_str << "," << lookup[2]
end
output << build_str
end
end
but it does not seem to work
Replace foreach with open (to get an Enumerable) and find:
FasterCSV.open('newfile.csv', 'w') do |output|
output << FasterCSV.open('names.csv').find { |r| r.index('foo') }[2]
end
The index call will return nil if it doesn't find anything; that means that the find will give you the first row that has 'foo' and you can pull out the column at index 2 from the result.
If you're not certain that names.csv will have what you're looking for then a bit of error checking would be advisable:
FasterCSV.open('newfile.csv', 'w') do |output|
foos_row = FasterCSV.open('names.csv').find { |r| r.index('foo') }
if(foos_row)
output << foos_row[2]
else
# complain or something
end
end
Or, if you want to silently ignore the lack of 'foo' and use an empty string instead, you could do something like this:
FasterCSV.open('newfile.csv', 'w') do |output|
output << (FasterCSV.open('names.csv').find { |r| r.index('foo') } || ['','',''])[2]
end
I'd probably go with the "complain if it isn't found" version though.
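For the follow-up about returning every 'foo' value, and as a sketch of the first-match lookup on current Rubies, the same idea works with the standard csv library (FasterCSV became Ruby's built-in CSV in 1.9). The helper names first_value_for and all_values_for are made up for this illustration, and the code assumes the key is in the first column and the wanted value in the second (index 1), as in the sample data; adjust the indexes for the real file.

```ruby
require 'csv'

# Sample rows copied from the question.
NAMES = <<~ROWS
  foo, happy
  foo, sad
  bar, tired
  foo, hungry
  foo, bad
ROWS

# Hypothetical helper: value from the first row whose first column equals +key+.
def first_value_for(rows, key)
  row = rows.find { |r| r[0] == key } # find stops at the first match
  row && row[1].to_s.strip
end

# Hypothetical helper: values from every row whose first column equals +key+.
def all_values_for(rows, key)
  rows.select { |r| r[0] == key }.map { |r| r[1].to_s.strip }
end

rows = CSV.parse(NAMES)
puts first_value_for(rows, 'foo')            # prints "happy"
puts all_values_for(rows, 'foo').join(', ')  # prints "happy, sad, hungry, bad"
```

find returns nil when nothing matches, so first_value_for returns nil in that case, and the caller can decide whether to complain or fall back to an empty string.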
