How to pass empty string to command line using OptionParser - ruby

I am trying to write a script that takes some arguments, some of which might be empty.
It seems that Ruby's OptionParser does not allow that and raises OptionParser::InvalidArgument.
Code:
require 'optparse'
options = {}
OptionParser.new do |opt|
opt.on('--might_be_empty might_be_empty', String) { |o| options[:might_be_empty] = o }
end.parse!
puts "might_be_empty: #{options[:might_be_empty]}"
Happy flow:
ruby ./for_stack.rb --might_be_empty "some_real_data"
might_be_empty: some_real_data
When the value is empty:
ruby ./for_stack.rb --might_be_empty ""
./for_stack.rb:10:in `<main>': invalid argument: --might_be_empty (OptionParser::InvalidArgument)
How can I tell the OptionParser to allow empty strings?

Leave coercion type unspecified, or use Object instead of String. Both behave the same.
opt.on('--might_be_empty might_be_empty') { ... }
# ..or
opt.on('--might_be_empty might_be_empty', Object) { ... }
Test:
ruby ./for_stack.rb --might_be_empty "some_real_data"
might_be_empty: some_real_data
ruby ./for_stack.rb --might_be_empty ""
might_be_empty:
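As a minimal sketch of the difference, the helper below parses an argument list with and without the String coercion. parse_value is a hypothetical name, not part of OptionParser, and it takes the arguments as an array so it can be tried without touching the real ARGV.

```ruby
require 'optparse'

# Hypothetical helper: parses argv, optionally coercing the value with `type`.
def parse_value(argv, type = nil)
  options = {}
  parser = OptionParser.new
  spec = ['--might_be_empty VALUE']
  spec << type if type # e.g. String, which rejects empty values
  parser.on(*spec) { |o| options[:might_be_empty] = o }
  parser.parse(argv) # parse (not parse!) works on a copy of argv
  options
end

# No coercion: the empty string is accepted.
p parse_value(['--might_be_empty', ''])

# String coercion: the empty string is rejected.
begin
  parse_value(['--might_be_empty', ''], String)
rescue OptionParser::InvalidArgument => e
  puts e.message
end
```

Leaving the type off keeps the option definition shortest; passing Object explicitly only documents the intent.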

According to the docs for OptionParser Type Coercion, passing String isn't just a "do nothing":
String – Any non-empty string
However, if you leave the argument pattern off the call to on (whose docs direct you to make_switch), the docs say:
Acceptable option argument format, must be pre-defined with OptionParser.accept or OptionParser#accept, or Regexp. This can appear once or assigned as String if not present, otherwise causes an ArgumentError.
While slightly confusing that it's "assigned as String if not present", it's not "assigned as a non-empty String if not present", and it will default to passing you any String, and work as you want it to:
opt.on('--might_be_empty might_be_empty') { |o| options[:might_be_empty] = o }
# is optional
% ruby example.rb
might_be_empty:
# if passed, must have a value
% ruby example.rb --might_be_empty
Traceback (most recent call last):
example.rb:8:in '<main>': missing argument: --might_be_empty (OptionParser::MissingArgument)
# can pass an empty string
% ruby example.rb --might_be_empty ""
might_be_empty:
# can pass any string
% ruby example.rb --might_be_empty "not empty"
might_be_empty: not empty
If you don't want to just leave the argument pattern off, you can create custom conversions, though this seems like overkill to me.

OptionParser allows optional values.
Running this several times:
require 'optparse'
options = {}
OptionParser.new do |opt|
opt.on('--might_be_empty [arg]') { |o| options[:might_be_empty] = o }
end.parse!
puts 'options.has_key?(:might_be_empty) is %s' % options.has_key?(:might_be_empty)
puts 'options[:might_be_empty] is "%s"' % options[:might_be_empty]
puts 'options[:might_be_empty] is a %s' % options[:might_be_empty].class
pp ARGV
Shows me:
$ ruby test.rb -m
options.has_key?(:might_be_empty) is true
options[:might_be_empty] is ""
options[:might_be_empty] is a NilClass
[]
$ ruby test.rb -m foo
options.has_key?(:might_be_empty) is true
options[:might_be_empty] is "foo"
options[:might_be_empty] is a String
[]
$ ruby test.rb -m 1
options.has_key?(:might_be_empty) is true
options[:might_be_empty] is "1"
options[:might_be_empty] is a String
[]
This is documented a couple times in the sample code but not explicitly stated in the text:
def perform_inplace_option(parser)
# Specifies an optional option argument
parser.on("-i", "--inplace [EXTENSION]",
"Edit ARGV files in place",
"(make backup if EXTENSION supplied)") do |ext|
self.inplace = true
self.extension = ext || ''
self.extension.sub!(/\A\.?(?=.)/, ".") # Ensure extension begins with dot.
end
end
Also note that you don't have to coerce the returned value because it's already a String. Any values passed in from the command-line are strings as they're snatched from ARGV.
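The three cases an optional argument produces (flag absent, flag given without a value, flag given with a value) can be told apart as in the sketch below. describe_flag and the symbols it returns are hypothetical names chosen for illustration.

```ruby
require 'optparse'

# Hypothetical helper: reports how --might_be_empty was used in argv.
def describe_flag(argv)
  options = {}
  OptionParser.new do |opt|
    opt.on('--might_be_empty [arg]') { |o| options[:might_be_empty] = o }
  end.parse(argv)

  if !options.key?(:might_be_empty)
    :absent                  # the flag was not given at all
  elsif options[:might_be_empty].nil?
    :given_without_value     # the flag was given, but no value followed it
  else
    options[:might_be_empty] # the flag was given with a value
  end
end

p describe_flag([])
p describe_flag(['--might_be_empty'])
p describe_flag(['--might_be_empty', 'foo'])
```

Checking key? first matters because a missing key and a nil value both read as nil through options[:might_be_empty].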

Related

Ruby: What does the comment "frozen_string_literal: true" do?

This is the rspec binstub in my project directory.
#!/usr/bin/env ruby
begin
load File.expand_path("../spring", __FILE__)
rescue LoadError
end
# frozen_string_literal: true
#
# This file was generated by Bundler.
#
# The application 'rspec' is installed as part of a gem, and
# this file is here to facilitate running it.
#
require "pathname"
ENV["BUNDLE_GEMFILE"] ||= File.expand_path("../../Gemfile",
Pathname.new(__FILE__).realpath)
require "rubygems"
require "bundler/setup"
load Gem.bin_path("rspec-core", "rspec")
What is this intended to do?
# frozen_string_literal: true
# frozen_string_literal: true is a magic comment, first supported in Ruby 2.3, that tells Ruby that all string literals in the file are implicitly frozen, as if #freeze had been called on each of them. That is, if a string literal is defined in a file with this comment and you call a method on that string which modifies it, such as <<, you'll get RuntimeError: can't modify frozen String (reported as FrozenError, a subclass of RuntimeError, from Ruby 2.5 on).
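The comment must be in the first comment section of the file, before any code; it may follow a shebang line. In the binstub above it comes after code, so it has no effect there.
The comment was introduced in Ruby 2.3 so that code could prepare for frozen string literals becoming the default in Ruby 3, a change that was later dropped for Ruby 3.0.
When Ruby 2.3 or later is run with the --enable=frozen-string-literal flag, string literals are frozen in all files. You can override the global setting with # frozen_string_literal: false.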
If you want a string literal to be mutable regardless of the global or per-file setting, you can prefix it with the unary + operator (being careful with operator precedence) or call .dup on it:
# frozen_string_literal: true
"".frozen?
=> true
(+"").frozen?
=> false
"".dup.frozen?
=> false
You can also freeze a mutable (unfrozen) string with unary -.
Source: magic_comment defined in ruby/ruby
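Because the magic comment only takes effect at the top of a file, the sketch below uses an explicit #freeze to stand in for what the comment does to every literal, then shows unary + and unary - at work. append_fails? is a hypothetical helper introduced only for this illustration.

```ruby
# Explicit #freeze stands in for what the magic comment does to each literal.
greeting = 'hello'.freeze

# Hypothetical helper: returns true if appending to str raises.
def append_fails?(str)
  str << ' world'
  false
rescue RuntimeError # FrozenError is a RuntimeError subclass on Ruby 2.5+
  true
end

mutable_copy = +greeting     # unary plus: unfrozen duplicate of a frozen string
refrozen     = -mutable_copy # unary minus: frozen string with the same content

puts append_fails?(greeting)     # appending to the frozen string raises
puts append_fails?(mutable_copy) # appending to the unfrozen copy succeeds
puts refrozen.frozen?
```

Unary + returns the receiver itself when it is already mutable, so it is cheap to apply defensively before modifying a string.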
It improves application performance by not allocating new space for the same string, which also saves garbage-collection time. How? When you freeze a string literal (a string object), you're telling Ruby not to let any part of your program modify it.
Some observations to keep in mind.
1. By freezing string literals, you avoid allocating new memory each time the same literal is evaluated.
Example:
Without the magic comment, Ruby allocates new space for the same string each time
(Observe the different object IDs printed)
def hello_id
a = 'hello'
a.object_id
end
puts hello_id #=> 70244568358640
puts hello_id #=> 70244568358500
With the magic comment, Ruby allocates space only once
# frozen_string_literal: true
def hello_id
a = 'hello'
a.object_id
end
puts hello_id #=> 70244568358640
puts hello_id #=> 70244568358640
2. By freezing string literals, your program will raise an exception when trying to modify the string literal.
Example:
Without the magic comment, you can modify string literals.
name = 'Johny'
name << ' Cash'
puts name #=> Johny Cash
With the magic comment, an exception is raised when you modify a string literal
# frozen_string_literal: true
name = 'john'
name << ' cash' #=> `<main>': can't modify frozen String (FrozenError)
puts name
Further reading:
https://bugs.ruby-lang.org/issues/8976
https://www.mikeperham.com/2018/02/28/ruby-optimization-with-one-magic-comment/
For Ruby 3.0, Matz (Ruby’s creator) had decided to make all string literals frozen by default.
EDIT 2019: he decided to abandon the idea of making frozen-string-literals default for Ruby 3.0 (source: https://bugs.ruby-lang.org/issues/11473#note-53)
You can use it in Ruby 2.3 and later. Just add this comment at the top of your files.
# frozen_string_literal: true
The above comment at top of a file changes semantics of static string
literals in the file. The static string literals will be frozen and
always returns same object. (The semantics of dynamic string literals
is not changed.)
This way has following benefits:
No ugly f-suffix.
No syntax error on older Ruby.
We need only a line for each file.
Please read this topic for more information.
https://bugs.ruby-lang.org/issues/8976

Pass regular expression as script argument Ruby

I am trying to pass regular expression to process a file line by line. The regular expression works fine if I hard code it in the code, like this.
File.foreach(filename).with_index do |line, line_num|
md5 = line.scan(/[0-9a-f]{32}/i)
puts md5
end
This works wonderfully, and I can see every line that has an MD5 hash on it printed. The problem comes when I try to pass the regular expression that matches MD5 hashes as a script argument, like:
ruby md5.rb -h "/[0-9a-f]{32}/i"
options = {}
OptionParser.new do |opts|
opts.on('-h', '--hash "<hash regex>"', 'Hash Regex') { |v| options[:hash] = v }
end.parse!
hash = options[:hash]
File.foreach(filename).with_index do |line, line_num|
md5 = line.scan(hash)
puts md5
end
You can pass the inside bits of the regex as a string, then convert it to a regex later, e.g.:
ruby md5.rb -h "[0-9a-f]{32}"
To convert a string into a regex, just use interpolation:
regex = /#{regex_string}/i
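An alternative to interpolation is Regexp.new, which makes it easy to report a malformed pattern. The sketch below assumes the pattern arrives without the surrounding slashes; build_hash_regex is a hypothetical helper name.

```ruby
# Hypothetical helper: builds a case-insensitive regex from a pattern string,
# such as the value received from the command line.
def build_hash_regex(pattern_source)
  Regexp.new(pattern_source, Regexp::IGNORECASE)
rescue RegexpError => e
  raise ArgumentError, "invalid regular expression: #{e.message}"
end

hash = build_hash_regex('[0-9a-f]{32}')
line = 'checksum D41D8CD98F00B204E9800998ECF8427E ok'
puts line.scan(hash)
```

Regexp::IGNORECASE plays the role of the trailing i in the literal form.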

Splitting ARGV into two file lists

I am using Ruby OptionParser but can not figure out how to get non-option arguments as two lists.
myscript --option-one --option-two file1 file2 -- file10 file11
Is there a way to get from OptionParser two lists of files separately?
[file1, file2]
[file10, file11]
I do not care which of them remains in ARGV, just want to have two lists separately to submit them to different processing.
My current solution is adding a handler for -- as follows
opts.on('--', 'marks the beginning of a different list of files') do
ARGV.unshift(:separator)
end
this produces ARGV with the following content
[ file1, file2, :separator, file10, file11 ]
and then, outside of OptionParser and after parse! was called, I modify ARGV
list1 = ARGV.shift(ARGV.index(:separator))
ARGV.shift
Is there a more elegant way of accomplishing it?
You're not using OptionParser correctly. It has the ability to create arrays/lists for you, but you have to tell it what you want.
You can define two separate options that each take an array, or, you could define one that takes an array and the other comes from ARGV after OptionParser finishes its parse! pass.
require 'optparse'
options = {}
OptionParser.new do |opt|
opt.on('--foo PARM1,PARM2', Array, 'first file list') { |o| options[:foo] = o }
opt.on('--bar PARM1,PARM2', Array, 'second file list') { |o| options[:bar] = o }
end.parse!
puts options
Saving and running that:
ruby test.rb --foo a,b --bar c,d
{:foo=>["a", "b"], :bar=>["c", "d"]}
Or:
require 'optparse'
options = {}
OptionParser.new do |opt|
opt.on('--foo PARM1,PARM2', Array, 'first file list') { |o| options[:foo] = o }
end.parse!
puts options
puts 'ARGV contains: "%s"' % ARGV.join('", "')
Saving and running that:
ruby test.rb --foo a,b c d
{:foo=>["a", "b"]}
ARGV contains: "c", "d"
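You don't need to define --. By convention it marks the end of options, and OptionParser already stops option processing when it sees it; the shell passes it through to the script unchanged. man sh describes the same convention for the shell's own options: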
-- A -- signals the end of options and disables further option processing. Any arguments after the --
are treated as filenames and arguments. An argument of - is equivalent to --.
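If the two lists really must be separated by -- on the command line, one possible approach (a different technique from the Array options above) is to split the argument list at the first -- before handing the first half to OptionParser. In this sketch parse_two_lists is a hypothetical helper, and --option-one and --option-two are taken from the question as plain switches.

```ruby
require 'optparse'

# Hypothetical helper: returns [options, files_before_separator, files_after_separator].
def parse_two_lists(argv)
  cut = argv.index('--')
  head = cut ? argv[0...cut] : argv.dup
  tail = cut ? argv[(cut + 1)..-1] : []

  options = {}
  parser = OptionParser.new do |opt|
    opt.on('--option-one') { options[:one] = true }
    opt.on('--option-two') { options[:two] = true }
  end
  first_list = parser.parse(head) # returns the non-option arguments
  [options, first_list, tail]
end

options, list1, list2 = parse_two_lists(
  %w[--option-one --option-two file1 file2 -- file10 file11]
)
p options
p list1
p list2
```

Note that this treats the first literal -- as the separator, so it would misfire if an option ever needed -- as its own value.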

Specify a nil in command line args

I have a command line tool which generates a flat json doc. Imagine this:
$ ruby gen-json.rb --foo bar --baz qux
{
"foo": "bar",
"baz": "qux"
}
What I want is this to work:
$ ruby gen-json.rb --foo $'\0' --baz qux
{
"foo": null,
"baz": "qux"
}
Instead of null I get an empty string. To simplify the problem even further consider this:
$ cat nil-args.rb
puts "argv[0] is nil" if ARGV[0].nil?
puts "argv[0] is an empty string" if ARGV[0] == ""
puts "argv[1] is nil" if ARGV[1].nil?
I want to run it like this and get this output:
$ ruby nil-args.rb $'\0' foo
argv[0] is nil
But instead I get
argv[0] is an empty string
I suspect this is (arguably) a bug in the ruby interpreter. It is treating argv[0] as a C string which null terminates.
Command line arguments are always strings. You will need to use a sentinel to indicate arguments you want to treat otherwise.
I'm pretty sure you literally cannot do what you are proposing. It's a fundamental limitation of the shell you are using. You can only ever pass string arguments into a script.
It has already been mentioned in a comment but the output you get with the \0 method you tried makes perfect sense. The null terminator technically is an empty string.
Also consider that accessing any element of an array that has not yet been defined will always be nil.
a = [1, 2, 3]
a[10].nil?
#=> true
A possible solution, however, would be for your program to work like this:
$ ruby gen-json.rb --foo --baz qux
{
"foo": null,
"baz": "qux"
}
So when you have a double minus sign argument followed by another double minus sign argument, you infer that the first one was null. You will need to write your own command line option parser to achieve this, though.
Here is a very very simple example script that seems to work (but likely has edge cases and other problems):
require 'pp'
json = {}
ARGV.each_cons(2) do |key, value|
next unless key.start_with? '--'
json[key] = value.start_with?('--') ? nil : value
end
pp json
Would that work for your purposes? :)
Ruby treats EVERYTHING except nil and false as true. Therefore, any empty string will evaluate as true (as well as 0 or 0.0).
You can force the nil behaviour by having something like this:
ARGV.map! do |arg|
if arg.empty?
nil
else
arg
end
end
This way, any empty string is replaced with nil.
You're going to have to decide on a special character to mean nil. I would suggest "\0" (double quotes ensure the literal string backslash zero is passed in). You can then use a special type converter in OptionParser:
require 'optparse'
require 'json'
values = {}
parser = OptionParser.new do |opts|
  # Register the converter before any option refers to it.
  opts.accept(:nulldetect) do |string|
    if string == '\0'
      nil
    else
      string
    end
  end
  opts.on("--foo VAL", :nulldetect) { |val| values[:foo] = val }
  opts.on("--bar VAL", :nulldetect) { |val| values[:bar] = val }
end
parser.parse!
puts values.to_json
Obviously, this requires you to be explicit about the flags you accept, but if you want to accept any flag, you can certainly just hand-jam the if statements for checking for "\0"
As a side note, you can make a flag's argument optional, in which case the block receives nil when the flag is passed without a value, e.g.
opts.on("--foo [VAL]") do |val|
# if --foo is passed w/out args, val is nil
end
So a stand-in is still needed whenever the flag has to be followed by an explicit value.

ruby - how to correctly parse varying numbers of command line arguments

n00b question alert!
here is the problem:
I am creating a shell script that takes a minimum of 3 arguments: a string, a line number, and at least one file.
I've written a script that will accept EXACTLY 3 arguments, but I don't know how to handle multiple file name arguments.
here's the relevant parts of my code (skipping the writing back into the file etc):
#!/usr/bin/env ruby
the_string = ARGV[0]
line_number = ARGV[1]
the_file = ARGV[2]
def insert_script(str, line_n, file)
f = file
s = str
ln = line_n.to_i
if (File.file? f)
read_in(f,ln,s)
else
puts "false"
end
end
def read_in(f,ln,s)
lines = File.readlines(f)
lines[ln] = s + "\n"
return lines
end
# run it
puts insert_script(the_string, line_number, the_file)
now I know that it's easy to write a block that will iterate through ALL the arguments:
ARGV.each do |a|
puts a
end
but I need to ONLY loop through the args from ARGV[2] (the first file name) to the last file name.
I know there's got to be - at a minimum - at least one easy way to do this, but I just can't see what it is at the moment!
in any case - I'd be more than happy if someone can just point me to a tutorial or an example, I'm sure there are plenty out there - but I can't seem to find them.
thanks
Would you consider using a helpful gem? Trollop is great for command line parsing because it automatically gives you help messages, long and short command-line switches, etc.
require 'trollop'
opts = Trollop::options do
opt :string, "The string", :type => :string
opt :line, "line number", :type => :int
opt :file, "file(s)", :type => :strings
end
p opts
When I call it "commandline.rb" and run it:
$ ruby commandline.rb --string "foo bar" --line 3 --file foo.txt bar.txt
{:string=>"foo bar", :line=>3, :file=>["foo.txt", "bar.txt"], :help=>false, :string_given=>true, :line_given=>true, :file_given=>true}
If you remove from the ARGV array the elements you don't want treated as filenames, you can treat all remaining elements as filenames and iterate over their contents with ARGF.
That's a mouthful, a small example will demonstrate it more easily:
argf.rb:
#!/usr/bin/ruby
str = ARGV.shift
line = ARGV.shift
ARGF.each do |f|
puts f
end
$ ./argf.rb one two argf.rb argf.rb
#!/usr/bin/ruby
str = ARGV.shift
line = ARGV.shift
ARGF.each do |f|
puts f
end
#!/usr/bin/ruby
str = ARGV.shift
line = ARGV.shift
ARGF.each do |f|
puts f
end
$
There are two copies of the argf.rb file printed to the console because I gave the filename argf.rb twice on the command line. It was opened and iterated over once for each mention.
If you want to operate on the files as files, rather than read their contents, you can simply modify the ARGV array and then use the remaining elements directly.
The canonical way is to use shift, like so:
the_string = ARGV.shift
line_number = ARGV.shift
ARGV.each do |file|
puts insert_script(the_string, line_number, file)
end
Take a look at OptionParser - http://ruby-doc.org/stdlib-1.9.3/libdoc/optparse/rdoc/OptionParser.html. It allows you to specify the number of arguments, whether they are mandatory or optional, handle errors such as MissingArgument or InvalidOption.
An alternate (and somewhat uglier) trick if you don't want to use another library or change the ARGV array is to use .upto
2.upto(ARGV.length-1) do |i|
puts ARGV[i]
end
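Another way to leave ARGV untouched is a splat assignment, which collects everything after the first two arguments into an array. The sketch below is one possible shape; split_arguments and its usage message are hypothetical.

```ruby
# Hypothetical helper: splits argv into the string, the line number, and the files.
def split_arguments(argv)
  the_string, line_number, *files = argv
  if the_string.nil? || line_number.nil? || files.empty?
    raise ArgumentError, 'usage: script STRING LINE_NUMBER FILE...'
  end
  # Integer() raises ArgumentError for input that is not a whole number.
  [the_string, Integer(line_number), files]
end

the_string, line_number, files = split_arguments(%w[text 3 a.txt b.txt])
files.each { |file| puts "#{file}: insert #{the_string.inspect} at line #{line_number}" }
```

Validating the count up front gives a clear usage error instead of a nil surfacing later in the script.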
