run external program in Ruby IO.popen : rescue not working - ruby

I'm using the Tika jar to extract metadata from Microsoft Word doc files but in the case Tika encounters a problem my rescue is not catching the error, instead the scripts exits. I'm on windows 7 with MRI Ruby 1.9.3
I could adapt the doc file but I want to avoid having this problem with future files.
How can I capture this error ?
JARPATH = "jar/tika-app-1.6.jar"
def metadata
return #metadata if defined? #metadata
switch = '-m -j'
begin
command = %Q{java -Djava.awt.headless=true -jar #{JARPATH} #{switch} "#{#path}"}
output = IO.popen(command+" 2>&1") do |io|
io.read
end
if output.respond_to?(:to_str)
#metadata = JSON.parse(output)
else
#metadata = nil
end
rescue => e
puts e
puts e.backtrace
end
end
This is the output I get
c:/Ruby193/lib/ruby/gems/1.9.1/gems/json-1.8.2/lib/json/common.rb:155:in `parse': 757: unexpected token at 'Exception in thread "main" org.apache.tika.exception.TikaException: TIKA-198: Illegal IOExce
ption from org.apache.tika.parser.microsoft.OfficeParser#1006d75 (JSON::ParserError)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:250)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:143)
at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:422)
at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:113)
Caused by: java.io.IOException: Invalid header signature; read 0x04090000002DA5DB, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:140)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:115)
at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:204)
at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:163)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:162)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
... 5 more
'
from c:/Ruby193/lib/ruby/gems/1.9.1/gems/json-1.8.2/lib/json/common.rb:155:in `parse'
from C:/Users/.../tika.rb:37:in `metadata'
from C:/Users/.../index_helpers.rb:55:in `index_doc'
from index.rb:39:in `block in <main>'
from index.rb:20:in `each'
from index.rb:20:in `each_with_index'
from index.rb:20:in `<main>'

After your call to IO.popen you are passing the output from the child program to JSON.parse, regardless of whether it is valid. The exception you see is the json parser trying to parse the Java exception method, which is captured because you redirect stderr with 2>&1.
You need to check that the child process completed successfully before continuing. The simplest way is probably to use the $? special variable, which indicates the status of the last executed child process, after the call to popen. This variable is an instance if Process::Status. You could do something like this:
output = IO.popen(command+" 2>&1") do |io|
io.read
end
unless $?.success?
# Handle the error however you feel is best, e.g.
puts "Tika had an error, the message was:\n#{output}"
raise "Tika error"
end
For more control you could look at the Open3 module in the standard library. Since Tika is a Java program, another possibility might be to look into using JRuby and call it directly.

Related

Using rescue and ensure in the middle of code

Still new to Ruby - I've had a look at some of the answers to seemingly similar questions but, to be honest, I couldn't get my head around them.
I have some code that reads a .csv file. The data is split into groups of 40-50 rows per user record and validates data in the rows against a database accessed via a website.
A login is required for each record, but once that user has logged in each row in the .csv file can be checked until the next user is reached, at which point the user logs out.
All that's working, however, if an error occurs (e.g. a different result on the website than the expected result on the .csv file) the program stops.
I need something that will
a) at tell me which line on the file the error occurred
b) log the row to be output when it's finished running, and
iii) restart the program from the next line in the .csv file
The code I have so far is below
Thanks in advance,
Peter
require 'csv-mapper'
loginrequired = true
Given(/^I compare the User Details from rows "(.*?)" to "(.*?)"$/) do |firstrow, lastrow|
data = CsvMapper.import('C:/auto_test_data/User Data csv.csv') do
[dln, nino, pcode, endor_cd, ct_cd]
end
#Row number changed because Excel starts at 'row 1' and Ruby starts counting at 'row 0'
(firstrow.to_i-1..lastrow.to_i-1).each do |row|
#licnum1 = data.at(row).dln
#licnum2 = data.at(row+1).dln
#nino = data.at(row).nino
#postcode = data.at(row).pcode
#endor_cd = data.at(row).endor_cd
#ct_cd = data.at(row).ct_cd
#Login only required once for each new user-account
if
loginrequired == true
logon_to_vdr #def for this is in hooks
click_on 'P and D'
loginrequired = false
end
#This is the check against the database and is required for every line in the .csv file
check_ctcd #def for this is in hooks
#Need something in here to log errors and move on to the next line in the .csv file
#Compare the ID for the next record and logout if they're different
if #licnum1 == #licnum2
loginrequired = false
else
loginrequired = true`enter code here`
click_on 'Logout'
end
end
end
It seems like you need some error logging since you apparently don't know what type of error you're receiving or where. If this script is standalone you can redirect $stderr to file so that you can read what went wrong.
# put this line at the top of your script
$stderr = File.open("/path/to/your/logfile.log","a")
When an error occurs, ruby will automatically write the error message, class, and backtrace to the log file you specify so that you can trace back the line where things are not going as expected. (When you run a script from the command line, normally this information will just get blurted back to the terminal when an error happens.)
For example, on my desktop I created a file log_stderr.rb with the following (line numbers included):
1 $stderr = File.open("C:/Users/me/Desktop/my_log.log","w")
2
3 #require a file which will raise an error to see the backtrace
4 require_relative 'raise_error.rb'
5
6 puts "code that will never be reached"
Also on my desktop I created the file raise_error.rb with the following (to deepen the backtrace for better example output):
1 # call raise to generate an error arbitrarily
2 # to halt execution and exit the program.
3 raise RuntimeError, 'the program stopped working!'
When I run ruby log_stderr.rb from the command line, my_log.log is created on my desktop with the following:
C:/Users/me/Desktop/raise_error.rb:3:in `<top (required)>': the program stopped working! (RuntimeError)
from C:/Users/me/Desktop/log_stderr.rb:4:in `require_relative'
from C:/Users/me/Desktop/log_stderr.rb:4:in `<main>'
If you are working in a larger environment where your script is being called amidst other scripts then you probably do not want to redirect $stderr because this would affect everything else running in the environment. ($stderr is global as indicated by the $ variable prefix.) If this is the case you would want to implement a begin; rescue; end structure and also make your own logfile so that you don't affect $stderr.
Again, since you don't know where the error is happening you want to wrap the whole script with begin; end
# at the very top of the script, begin watching for weirdness
begin
logfile = File.open("/path/to/your/logfile.log", "w")
require 'csv-mapper'
#. . .
# rescue and end at the very bottom to capture any errors that have happened
rescue => e
# capture details about the error in your logfile
logfile.puts "ERROR:", e.class, e.message, e.backtrace
# pass the error along since you don't know what it is
# and there may have been a very good reason to stop the program
raise e
end
If you find that your error is happening only in the block (firstrow.to_i-1..lastrow.to_i-1).each do |row| you can place the begin; end inside of this block to have access to the local row variable, or else create a top level variable independent of the block and assign it during each iteration of the block to report to your logfile:
begin
logfile = File.open("/path/to/your/logfile.log", "w")
csv_row = "before csv"
#. . .
(firstrow.to_i-1..lastrow.to_i-1).each do |row|
csv_row = row
#. . .
end
csv_row = "after csv"
rescue => e
logfile.puts "ERROR AT ROW: #{csv_row}", e.class, e.message, e.backtrace
raise e
end
I hope this helps!
It doesn't seem like you need to rescue exception here. But what you could do is in your check_ctcd method, raise error if records doesn't match. Then you can rescue from it. In order to know which line it is, in your iteration, you could use #each_with_index and log the index when things go wrong.
(firstrow.to_i-1..lastrow.to_i-1).each_with_index do |row, i|
#licnum1 = data.at(row).dln
#licnum2 = data.at(row+1).dln
#nino = data.at(row).nino
#postcode = data.at(row).pcode
#endor_cd = data.at(row).endor_cd
#ct_cd = data.at(row).ct_cd
#Login only required once for each new user-account
if
loginrequired == true
logon_to_vdr #def for this is in hooks
click_on 'P and D'
loginrequired = false
end
#This is the check against the database and is required for every line in the .csv file
check_ctcd #def for this is in hooks
rescue => e
# log the error and index here
...
And you can make your own custom error, and rescue only the certain type so that you don't silently rescue other errors.

Catch errors in vim/ruby

I'm writing a vim plugin using the ruby interface.
When I execute VIM::command(...), how can I detect if vim raised an error during execution of this command, so that I can skip further commands and also present a better message to the user?
Vim's global variable v:errmsg will give you the last error. If you want to check whether an error occured, you can first set it to an empty string and then check for it:
let v:errmsg = ""
" issue your command
if v:errmsg != ""
" handle the error
endif;
I'll leave it up to you to transfer this to the Ruby API. Also see :h v:errmsg from inside Vim. Other useful global variables may be:
v:exception
v:throwpoint
Edit – this should work (caution: some magic involved):
module VIM
class Error < StandardError; end
class << self
def command_with_error *args
command('let v:errmsg=""')
command(*args)
msg = evaluate('v:errmsg')
raise ::VIM::Error, msg unless msg.empty?
end
end
end
# Usage
# use sil[ent]! or the error will bubble up to Vim
begin
VIM::command_with_error('sil! foobar')
rescue VIM::Error => e
puts 'Rescued from: ' + e.message;
end
# Output
Rescued from: E492: Not an editor command: sil! foobar

Is there any way to recover from Psych::SyntaxError?

When there is a syntax error in an i18n locale YAML file, Psych::SyntaxError is raised. When this exception is encountered during Rails bootup (for example, when production is restarted), Rails crashes.
Is there any way to capture this exception and somehow recover from it without having Rails crash altogether?
Is there any way to check locale files for syntax errors before commit or deploy in an automated way?
I'm not sure if there's a way to recover from this error, but I've created a rake task that ensures that a given YAML file is syntactically valid (run via a pre-commit git hook for any changed YAML files):
namespace :yaml do
desc "Check YAML syntax for errors."
task :check_syntax do
require 'YAML'
require 'colorize'
puts "Checking YAML files..."
filenames = (ENV['FILENAMES'].split(',') || []).push(ENV['FILENAME']).uniq.compact
fails = 0
filenames.each do |file|
print "#{file}... "
begin
YAML.load_file(file)
rescue Psych::SyntaxError => e
fails += 1
print "Failed! ".red
print "[#{e.message.sub(/\A.*: /, '')}]"
puts
next
end
print "Success!".green
puts
end
puts
exit fails > 0 ? 1 : 0
end
end

How to check if IO.popen succeeded? [duplicate]

When I use IO::popen with a non-existent command, I get an error message printed to the screen:
irb> IO.popen "fakefake"
#=> #<IO:0x187dec>
irb> (irb):1: command not found: fakefake
Is there any way I can capture this error, so I can examine from within my script?
Yes: Upgrade to ruby 1.9. If you run that in 1.9, an Errno::ENOENT will be raised instead, and you will be able to rescue it.
(Edit) Here is a hackish way of doing it in 1.8:
error = IO.pipe
$stderr.reopen error[1]
pipe = IO.popen 'qwe' # <- not a real command
$stderr.reopen IO.new(2)
error[1].close
if !select([error[0]], nil, nil, 0.1)
# The command was found. Use `pipe' here.
puts 'found'
else
# The command could not be found.
puts 'not found'
end

How do I get ruby to print a full backtrace instead of a truncated one?

When I get exceptions, it is often from deep within the call stack. When this happens, more often than not, the actual offending line of code is hidden from me:
tmp.rb:7:in `t': undefined method `bar' for nil:NilClass (NoMethodError)
from tmp.rb:10:in `s'
from tmp.rb:13:in `r'
from tmp.rb:16:in `q'
from tmp.rb:19:in `p'
from tmp.rb:22:in `o'
from tmp.rb:25:in `n'
from tmp.rb:28:in `m'
from tmp.rb:31:in `l'
... 8 levels...
from tmp.rb:58:in `c'
from tmp.rb:61:in `b'
from tmp.rb:64:in `a'
from tmp.rb:67
That "... 8 levels..." truncation is causing me a great deal of trouble. I'm not having much success googling for this one: How do I tell ruby that I want dumps to include the full stack?
Exception#backtrace has the entire stack in it:
def do_division_by_zero; 5 / 0; end
begin
do_division_by_zero
rescue => exception
puts exception.backtrace
raise # always reraise
end
(Inspired by Peter Cooper's Ruby Inside blog)
You could also do this if you'd like a simple one-liner:
puts caller
This produces the error description and nice clean, indented stacktrace:
begin
# Some exception throwing code
rescue => e
puts "Error during processing: #{$!}"
puts "Backtrace:\n\t#{e.backtrace.join("\n\t")}"
end
IRB has a setting for this awful "feature", which you can customize.
Create a file called ~/.irbrc that includes the following line:
IRB.conf[:BACK_TRACE_LIMIT] = 100
This will allow you to see 100 stack frames in irb, at least. I haven't been able to find an equivalent setting for the non-interactive runtime.
Detailed information about IRB customization can be found in the Pickaxe book.
One liner for callstack:
begin; Whatever.you.want; rescue => e; puts e.message; puts; puts e.backtrace; end
One liner for callstack without all the gems's:
begin; Whatever.you.want; rescue => e; puts e.message; puts; puts e.backtrace.grep_v(/\/gems\//); end
One liner for callstack without all the gems's and relative to current directory
begin; Whatever.you.want; rescue => e; puts e.message; puts; puts e.backtrace.grep_v(/\/gems\//).map { |l| l.gsub(`pwd`.strip + '/', '') }; end
This mimics the official Ruby trace, if that's important to you.
begin
0/0 # or some other nonsense
rescue => e
puts e.backtrace.join("\n\t")
.sub("\n\t", ": #{e}#{e.class ? " (#{e.class})" : ''}\n\t")
end
Amusingly, it doesn't handle 'unhandled exception' properly, reporting it as 'RuntimeError', but the location is correct.
Almost everybody answered this. My version of printing any rails exception into logs would be:
begin
some_statement
rescue => e
puts "Exception Occurred #{e}. Message: #{e.message}. Backtrace: \n #{e.backtrace.join("\n")}"
Rails.logger.error "Exception Occurred #{e}. Message: #{e.message}. Backtrace: \n #{e.backtrace.join("\n")}"
end
I was getting these errors when trying to load my test environment (via rake test or autotest) and the IRB suggestions didn't help. I ended up wrapping my entire test/test_helper.rb in a begin/rescue block and that fixed things up.
begin
class ActiveSupport::TestCase
#awesome stuff
end
rescue => e
puts e.backtrace
end
[examine all threads backtraces to find the culprit]
Even fully expanded call stack can still hide the actual offending line of code from you when you use more than one thread!
Example: One thread is iterating ruby Hash, other thread is trying to modify it. BOOM! Exception! And the problem with the stack trace you get while trying to modify 'busy' hash is that it shows you chain of functions down to the place where you're trying to modify hash, but it does NOT show who's currently iterating it in parallel (who owns it)! Here's the way to figure that out by printing stack trace for ALL currently running threads. Here's how you do this:
# This solution was found in comment by #thedarkone on https://github.com/rails/rails/issues/24627
rescue Object => boom
thread_count = 0
Thread.list.each do |t|
thread_count += 1
err_msg += "--- thread #{thread_count} of total #{Thread.list.size} #{t.object_id} backtrace begin \n"
# Lets see if we are able to pin down the culprit
# by collecting backtrace for all existing threads:
err_msg += t.backtrace.join("\n")
err_msg += "\n---thread #{thread_count} of total #{Thread.list.size} #{t.object_id} backtrace end \n"
end
# and just print it somewhere you like:
$stderr.puts(err_msg)
raise # always reraise
end
The above code snippet is useful even just for educational purposes as it can show you (like x-ray) how many threads you actually have (versus how many you thought you have - quite often those two are different numbers ;)
You can also use backtrace Ruby gem (I'm the author):
require 'backtrace'
begin
# do something dangerous
rescue StandardError => e
puts Backtrace.new(e)
end

Resources