How to use variable arguments with ruby's OptionParser - ruby

I don't know ruby very well, but I'm trying to add some functionality to this script a co-worker wrote.
Basically, right now it takes a few flags plus standard input, and it uses OptionParser to parse the flags.
I want to use OptionParser to parse a selection of command line arguments similar to those of cat. So I guess my question is: how would I write the command line option parsing part of cat in ruby using OptionParser?
cat [OPTION]... [FILE]...
Hope that makes sense, any help is appreciated.

require 'optparse'

OPTS = {}
op = OptionParser.new do |x|
  x.banner = 'cat <options> <file>'
  x.separator ''
  # Each block must start on the same line as its x.on call
  x.on("-A", "--show-all", "Equivalent to -vET") { OPTS[:showall] = true }
  x.on("-b", "--number-nonblank", "number nonempty output lines") { OPTS[:number_nonblank] = true }
  x.on("-x", "--start-from NUM", Integer, "Start numbering from NUM") { |n| OPTS[:start_num] = n }
  x.on("-h", "--help", "Show this message") { puts op; exit }
end
op.parse!(ARGV)
# Example code for dealing with filenames (output_file is a placeholder)
ARGV.each { |fn| output_file(OPTS, fn) }
I shall leave other command line operations, as they say, as an exercise for the reader! You get the idea.
(NB: I had to invent a fictional -x parameter to demo passing a value after a flag.)
Update: I should have explained that this will leave ARGV as an array of filenames, assuming that the user has entered any.
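To see concretely what the update describes, here is a minimal sketch that runs the parser against a hard-coded argument array instead of ARGV. The parse_cat_args helper and the sample file names are made up for illustration, and only the -b and -x flags from the answer are reproduced.

```ruby
require 'optparse'

# Hypothetical helper: parses a cat-like argument list and returns
# the collected options together with the leftover file names.
def parse_cat_args(argv)
  opts = {}
  parser = OptionParser.new do |x|
    x.on('-b', '--number-nonblank', 'number nonempty output lines') { opts[:number_nonblank] = true }
    x.on('-x', '--start-from NUM', Integer, 'Start numbering from NUM') { |n| opts[:start_num] = n }
  end
  # parse (without the bang) works on a copy of the array and
  # returns the arguments that were not consumed as options.
  files = parser.parse(argv)
  [opts, files]
end

opts, files = parse_cat_args(%w[-b -x 5 a.txt b.txt])
p opts
p files
```

Here opts should hold the two flags and files the two names, which is the same state that op.parse!(ARGV) leaves behind in OPTS and ARGV.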

Related

Ruby Command Line Implicit Conditional Check

I ran the following from a bash shell:
echo 'hello world' | ruby -ne 'puts $_ if /hello/'
I thought it was a typo at first, but it outputted hello world surprisingly.
I meant to type:
echo 'hello world' | ruby -ne 'puts $_ if /hello/ === $_'
Can anyone give an explanation, or point to documentation, to why we get this implicit comparison to $_?
I'd also like to note:
echo 'hello world' | ruby -ne 'puts $_ if /test/'
Won't output anything.
The Ruby parser has a special case for regular expression literals in conditionals. Normally (i.e. without using the -e, -n or -p command line options) this code:
if /foo/
puts "TRUE!"
end
produces:
$ ruby regex-in-conditional1.rb
regex-in-conditional1.rb:1: warning: regex literal in condition
Assigning something that matches the regex to $_ first, like this:
$_ = 'foo'
if /foo/
puts "TRUE!"
end
produces:
$ ruby regex-in-conditional2.rb
regex-in-conditional2.rb:2: warning: regex literal in condition
TRUE!
This is a (poorly documented) exception to the normal rules for Ruby conditionals, where anything that’s not false or nil evaluates as truthy.
This only applies to regex literals; the following behaves as you might expect for a conditional:
regex = /foo/
if regex
puts "TRUE!"
end
output:
$ ruby regex-in-conditional3.rb
TRUE!
This is handled in the parser. Searching the MRI code for the text of the warning produces a single match in parse.y:
case NODE_DREGX:
case NODE_DREGX_ONCE:
warning_unless_e_option(parser, node, "regex literal in condition");
return NEW_MATCH2(node, NEW_GVAR(rb_intern("$_")));
I don’t know Bison, so I can’t explain exactly what is going on here, but there are some clues you can deduce. The warning_unless_e_option function simply suppresses the warning if the -e option has been set, as this feature is discouraged in normal code but can be useful in expressions from the command line (this explains why you don’t see the warning in your code). The next line seems to be constructing a parse subtree which is a regular expression match between the regex and the $_ global variable, which contains “[t]he last input line of string by gets or readline”. These nodes will then be compiled into the usual regular expression method call.
That shows what is happening. I’ll just finish with a quote from the Kernel#gets documentation, which may explain why this is such an obscure feature:
The style of programming using $_ as an implicit parameter is gradually losing favor in the Ruby community.
After digging through the Ruby source (MRI), I think I found an explanation.
The code:
pp RubyVM::InstructionSequence.compile('puts "hello world" if /hello/').to_a
produces the following output:
...
[:trace, 1],
[:putobject, /hello/],
[:getspecial, 0, 0],
[:opt_regexpmatch2],
...
The instructions seem to be calling opt_regexpmatch2 with two arguments, the first being the regex /hello/ and the second being the return value of getspecial.
getspecial can be found in insns.def:
/**
#c variable
#e Get value of special local variable ($~, $_, ..).
#j 特殊なローカル変数($~, $_, ...)の値を得る。
*/
DEFINE_INSN
getspecial
(rb_num_t key, rb_num_t type)
()
(VALUE val)
{
val = vm_getspecial(th, GET_LEP(), key, type);
}
Note that our instructions are most likely telling the VM to bring back the value of $_. $_ is automatically set for us when we run ruby with the correct options, e.g., -n
Now that we have our two arguments, we call opt_regexpmatch2
/**
#c optimize
#e optimized regexp match 2
#j 最適化された正規表現マッチ 2
*/
DEFINE_INSN
opt_regexpmatch2
(CALL_INFO ci)
(VALUE obj2, VALUE obj1)
(VALUE val)
{
if (CLASS_OF(obj2) == rb_cString &&
BASIC_OP_UNREDEFINED_P(BOP_MATCH, STRING_REDEFINED_OP_FLAG)) {
val = rb_reg_match(obj1, obj2);
}
else {
PUSH(obj2);
PUSH(obj1);
CALL_SIMPLE_METHOD(obj2);
}
}
At the end of the day, if /hello/ is equivalent to if $_ =~ /hello/, and $_ will be nil unless we run ruby with the correct options (e.g. -n).
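For code outside of one-liners, the same filter can be written without relying on $_ at all. The sketch below is one way to do it; matching_lines is an illustrative name, not something taken from the Ruby source discussed above.

```ruby
require 'stringio'

# Illustrative helper: returns the lines of io that match pattern,
# naming the line explicitly instead of relying on the implicit $_.
def matching_lines(io, pattern)
  io.each_line.select { |line| line =~ pattern }
end

# StringIO stands in for standard input here.
input = StringIO.new("hello world\ntest\n")
matching_lines(input, /hello/).each { |line| puts line }
```

Passing $stdin instead of the StringIO would make this behave like the ruby -ne one-liner from the question.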

Splitting ARGV into two file lists

I am using Ruby OptionParser but can not figure out how to get non-option arguments as two lists.
myscript --option-one --option-two file1 file2 -- file10 file11
Is there a way to get from OptionParser two lists of files separately?
[file1, file2]
[file10, file11]
I do not care which of them remains in ARGV, just want to have two lists separately to submit them to different processing.
My current solution is
adding a handler of -- as follows
opts.on('--', 'marks the beginning of a different list of files') do
ARGV.unshift(:separator)
end
this produces ARGV with the following content
[ file1, file2, :separator, file10, file11 ]
and then, outside of OptionParser and after parse! was called, I modify ARGV
list1 = ARGV.shift(ARGV.index(:separator))
ARGV.shift
Is there a more elegant way of accomplishing it?
You're not using OptionParser correctly. It has the ability to create arrays/lists for you, but you have to tell it what you want.
You can define two separate options that each take an array, or, you could define one that takes an array and the other comes from ARGV after OptionParser finishes its parse! pass.
require 'optparse'
options = {}
OptionParser.new do |opt|
  opt.on('--foo PARM1,PARM2', Array, 'first file list') { |o| options[:foo] = o }
  opt.on('--bar PARM1,PARM2', Array, 'second file list') { |o| options[:bar] = o }
end.parse!
puts options
Saving and running that:
ruby test.rb --foo a,b --bar c,d
{:foo=>["a", "b"], :bar=>["c", "d"]}
Or:
require 'optparse'
options = {}
OptionParser.new do |opt|
  opt.on('--foo PARM1,PARM2', Array, 'first file list') { |o| options[:foo] = o }
end.parse!
puts options
puts 'ARGV contains: "%s"' % ARGV.join('", "')
Saving and running that:
ruby test.rb --foo a,b c d
{:foo=>["a", "b"]}
ARGV contains: "c", "d"
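You don't need to define --. By convention it marks the end of options, and OptionParser already honors it: option parsing stops at -- and the arguments after it are left alone as plain arguments. The shell passes -- through to the script unchanged; it follows the same convention for its own options. This is from man sh: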
-- A -- signals the end of options and disables further option processing. Any arguments after the --
are treated as filenames and arguments. An argument of - is equivalent to --.
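Note that OptionParser consumes the -- marker while parsing, so the boundary between the two lists is gone afterwards. If the goal is to keep the lists apart, one possible approach is to split the raw argument array at the first -- before OptionParser sees it. This is only a sketch: split_file_lists and the two option names are assumptions modelled on the question, not part of OptionParser.

```ruby
require 'optparse'

# Hypothetical helper: splits argv at the first "--" and only lets
# OptionParser see the part before it.
def split_file_lists(argv)
  marker = argv.index('--')
  head = marker ? argv[0...marker] : argv
  tail = marker ? argv[(marker + 1)..-1] : []

  options = {}
  parser = OptionParser.new do |opts|
    opts.on('--option-one') { options[:one] = true }
    opts.on('--option-two') { options[:two] = true }
  end
  first_list = parser.parse(head) # non-option arguments before "--"
  [options, first_list, tail]
end

options, list1, list2 = split_file_lists(
  %w[--option-one --option-two file1 file2 -- file10 file11]
)
p list1
p list2
```

Everything after the marker is returned verbatim, so arguments in the second list that happen to start with a dash are never interpreted as options.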

Passing all arguments at once to a method

I am trying to read arguments from a text file and then pass them all at once to a Ruby method.
The arguments in the text file are properly formatted e.g.:
"path", ["elem1","elem2"], 4,"string"
I intend to make a function call like this:
my_method("path", ["elem1","elem2"], 4,"string")
This is how I am trying to achieve it:
IO.readlines("path").each do |line|
puts "#{line}"
my_method(*line.split(","))
end
The problem is that in the method all the array elements are wrapped in quotes. So my method ends up getting this:
""path"", "["elem1","elem2"]", "4",""string""
Now, this is probably because it's an array of strings, but why wrap each element in an additional "" when I say *arr?
If I use eval:
IO.readlines("path").each do |line|
puts "#{line}"
my_method(*eval(line))
end
I end up with syntax error, unexpected ',' after the first argument in "path", ["elem1","elem2"], 4,"string"
How do I pass all the elements to the method at once when reading the arguments from a text file?
Also, since Ruby does not care about types, why do I have to wrap my arguments with "" in the first place? If I don't wrap an argument in quotes, I get an undefined variable for main:Object error.
I have one solution, but instead of using "," as your delimiter use some other special character as delimiter in the input line.
# Input line in somefile.txt delimited by "||" :
# "path" || ["elem1","elem2"] || 4 || "string"
def my_method(arg1, arg2, arg3, arg4)
path = arg1
arr = arg2.gsub(/([\[\]])/, "").split(",")
number = arg3.to_i
string = arg4
puts "path : #{path} and is #{path.class}"
puts "arr : #{arr} and is #{arr.class}"
puts "number : #{number} and is #{number.class}"
puts "string : #{string} and is #{string.class}"
end
IO.readlines("somefile.txt").each do |line|
my_method(*line.gsub(/[(\\")]/, " ").split("||"))
end
I hope this helped you out. Let me know if you have any problem.
IO.readlines("path").each do |line|
  # Wrap the whole line in brackets so that it evaluates to one array of arguments
  params = eval("[#{line}]")
  my_method(*params)
end
When you read the line, it is a single string, so to get arrays and integers out of it you might try to eval it first. Splitting on "," beforehand would also cut the nested array apart, which is why the line is evaluated as a whole.
The eval tip might be enough to fix your code.
If you pass a param without quotes, the interpreter will treat it as a variable or method name and not as a string. That's why you get the undefined variable error, so string arguments in the file still need their quotes.
OBS: Be careful with eval, since it will execute any code: a command to erase the file, or even worse (like messing with your computer or server), if the person behind the source of that file knows it.
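If the lines only ever contain strings, numbers and arrays, as in the example from the question, a sketch that avoids eval altogether is to treat each line as the inside of a JSON array. This assumes the file uses double-quoted strings and no Ruby-only syntax such as symbols; args_from_line is a made-up name, and my_method below is just a stand-in for the asker's method.

```ruby
require 'json'

# Hypothetical helper: turns one line such as
#   "path", ["elem1","elem2"], 4,"string"
# into an argument array by wrapping it in [ ] and parsing it as JSON.
def args_from_line(line)
  JSON.parse("[#{line.strip}]")
end

# Stand-in for the asker's method; it only reports what it received.
def my_method(path, elems, count, label)
  "#{path.class} #{elems.class} #{count.class} #{label.class}"
end

args = args_from_line('"path", ["elem1","elem2"], 4,"string"')
puts my_method(*args)
```

Because the JSON parser only builds data and never runs code, a malformed or malicious line raises JSON::ParserError instead of being executed.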

ruby - how to correctly parse varying numbers of command line arguments

n00b question alert!
here is the problem:
I am creating a shell script that takes a minimum of 3 arguments: a string, a line number, and at least one file.
I've written a script that will accept EXACTLY 3 arguments, but I don't know how to handle multiple file name arguments.
here's the relevant parts of my code (skipping the writing back into the file etc):
#!/usr/bin/env ruby
the_string = ARGV[0]
line_number = ARGV[1]
the_file = ARGV[2]
def insert_script(str, line_n, file)
f = file
s = str
ln = line_n.to_i
if (File.file? f)
read_in(f,ln,s)
else
puts "false"
end
end
def read_in(f,ln,s)
lines = File.readlines(f)
lines[ln] = s + "\n"
return lines
end
# run it
puts insert_script(the_string, line_number, the_file)
now I know that it's easy to write a block that will iterate through ALL the arguments:
ARGV.each do |a|
puts a
end
but I need to ONLY loop through the args from ARGV[2] (the first file name) to the last file name.
I know there's got to be - at a minimum - at least one easy way to do this, but I just can't see what it is at the moment!
in any case - I'd be more than happy if someone can just point me to a tutorial or an example, I'm sure there are plenty out there - but I can't seem to find them.
thanks
Would you consider using a helpful gem? Trollop is great for command line parsing because it automatically gives you help messages, long and short command-line switches, etc.
require 'trollop'
opts = Trollop::options do
opt :string, "The string", :type => :string
opt :line, "line number", :type => :int
opt :file, "file(s)", :type => :strings
end
p opts
When I call it "commandline.rb" and run it:
$ ruby commandline.rb --string "foo bar" --line 3 --file foo.txt bar.txt
{:string=>"foo bar", :line=>3, :file=>["foo.txt", "bar.txt"], :help=>false, :string_given=>true, :line_given=>true, :file_given=>true}
If you modify the ARGV array to remove the elements you're no longer interested in treating as filenames, you can treat all remaining elements as filenames and iterate over their contents with ARGF.
That's a mouthful, a small example will demonstrate it more easily:
argf.rb:
#!/usr/bin/ruby
str = ARGV.shift
line = ARGV.shift
ARGF.each do |f|
puts f
end
$ ./argf.rb one two argf.rb argf.rb
#!/usr/bin/ruby
str = ARGV.shift
line = ARGV.shift
ARGF.each do |f|
puts f
end
#!/usr/bin/ruby
str = ARGV.shift
line = ARGV.shift
ARGF.each do |f|
puts f
end
$
There are two copies of the argf.rb file printed to the console because I gave the filename argf.rb twice on the command line. It was opened and iterated over once for each mention.
If you want to operate on the files as files, rather than read their contents, you can simply modify the ARGV array and then use the remaining elements directly.
The canonical way is to use shift, like so:
the_string = ARGV.shift
line_number = ARGV.shift
ARGV.each do |file|
  # Pass the current file from the loop, not the old the_file variable
  puts insert_script(the_string, line_number, file)
end
Take a look at OptionParser - http://ruby-doc.org/stdlib-1.9.3/libdoc/optparse/rdoc/OptionParser.html. It allows you to specify the number of arguments and whether they are mandatory or optional, and to handle errors such as MissingArgument or InvalidOption.
An alternate (and somewhat uglier) trick if you don't want to use another library or change the ARGV array is to use .upto
2.upto(ARGV.length-1) do |i|
puts ARGV[i]
end
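Another option that leaves ARGV untouched is a splat in a multiple assignment, which collects everything after the first two arguments into an array. The following is a small sketch under that assumption; split_args is a hypothetical helper and the sample arguments are invented.

```ruby
# Hypothetical helper: separates the two fixed arguments from the
# variable-length list of file names.
def split_args(argv)
  the_string, line_number, *files = argv
  # Fewer than three arguments means no file name was given.
  raise ArgumentError, 'expected: STRING LINE_NUMBER FILE...' if files.empty?
  [the_string, Integer(line_number), files]
end

the_string, line_number, files = split_args(%w[hello 3 a.txt b.txt])
files.each do |file|
  puts "would insert #{the_string.inspect} at line #{line_number} of #{file}"
end
```

In the real script, split_args(ARGV) would replace the three ARGV[n] lookups, and the loop body would call insert_script for each file.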

Looking for idiomatic way to regex-process a text file in Ruby

I'm looking for idiomatic way to regex-process a text file in Ruby, and here's the best thing I've been able to come up with so far. It removes all " chars:
#!/usr/bin/env ruby
src_name = ARGV[0]
dest_name = ARGV[1]
File.open(src_name, "r+") { |f|
new_lines = f.map { |l|
l = l.gsub(/"/,'')
}
dest_file = File.new(dest_name,"w")
new_lines.each { |l|
dest_file.puts l
}
}
There's got to be something better. For instance:
Why do I have to rewrite the file, shouldn't I be able to do something smarter with pipes?
I'm doing everything line-by-line, it seems like I should be able to address the problem with input and output streams.
eugen's answer is awesome. Here is the same thing as a "normal" script.
#!/usr/bin/env ruby
STDOUT << STDIN.read.gsub(/"/,'')
If you're going for a simple replacement like that, you can do it on the command line:
ruby -e '$_.gsub!(/"/,"")' -i.bak -p INPUT_FILE.txt
It runs whatever you pass as the argument to the -e flag, replaces the content of the INPUT_FILE.txt with the result and just for safety saves a copy of the original with the .bak extension.
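For the stream-oriented variant the question asks about, a line-by-line filter avoids reading the whole input into memory the way STDIN.read does. This is only a sketch; strip_quotes is an illustrative name, and StringIO objects stand in for the real streams so the example is self-contained.

```ruby
require 'stringio'

# Illustrative helper: copies input to output one line at a time,
# removing every double quote along the way.
def strip_quotes(input, output)
  input.each_line { |line| output << line.delete('"') }
  output
end

out = strip_quotes(StringIO.new(%(say "hi"\n"bye"\n)), StringIO.new)
puts out.string
```

Calling strip_quotes($stdin, $stdout) from a script would let it sit in a pipeline, for example ruby strip.rb < in.txt > out.txt, where strip.rb is a placeholder file name.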
