Specify a nil in command line args - ruby

I have a command line tool which generates a flat JSON document. Imagine this:
$ ruby gen-json.rb --foo bar --baz qux
{
"foo": "bar",
"baz": "qux"
}
What I want is this to work:
$ ruby gen-json.rb --foo $'\0' --baz qux
{
"foo": null,
"baz": "qux"
}
Instead of null I get an empty string. To simplify the problem even further consider this:
$ cat nil-args.rb
puts "argv[0] is nil" if ARGV[0].nil?
puts "argv[0] is an empty string" if ARGV[0] == ""
puts "argv[1] is nil" if ARGV[1].nil?
I want to run it like this and get this output:
$ ruby nil-args.rb $'\0' foo
argv[0] is nil
But instead I get
argv[0] is an empty string
I suspect this is (arguably) a bug in the Ruby interpreter: it is treating argv[0] as a C string, which is null-terminated.

Command line arguments are always strings. You will need to use a sentinel to indicate arguments you want to treat otherwise.
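As a rough illustration of the sentinel idea, here is a minimal sketch. The constant NIL_SENTINEL and the helper decode_nil_args are hypothetical names chosen for this example, and the choice of backslash-zero as the sentinel is only an assumption; any string that cannot be a real value would do.

```ruby
# Hypothetical sentinel: the two literal characters backslash and zero
# (single quotes keep Ruby from interpreting the escape).
NIL_SENTINEL = '\0'

# Returns a copy of argv in which every sentinel is replaced by nil.
# Everything else, including the empty string, is left untouched.
def decode_nil_args(argv, sentinel = NIL_SENTINEL)
  argv.map { |arg| arg == sentinel ? nil : arg }
end

p decode_nil_args(['\0', 'foo'])  # prints [nil, "foo"]
```

In a real script you would call decode_nil_args(ARGV) and invoke it from the shell with the sentinel quoted, for example ruby nil-args.rb '\0' foo.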

I'm pretty sure you literally cannot do what you are proposing. It's a fundamental limitation of the shell you are using. You can only ever pass string arguments into a script.
As a comment already mentioned, the output you get with the \0 method you tried makes perfect sense: arguments are null-terminated strings, so a lone null terminator is an empty string.
Also consider that accessing an array index that does not exist always returns nil.
a = [1, 2, 3]
a[10].nil?
#=> true
A possible solution, however, would be for your program to work like this:
$ ruby gen-json.rb --foo --baz qux
{
"foo": null,
"baz": "qux"
}
So when a flag starting with -- is immediately followed by another flag starting with --, you infer that the first one was null. You will need to write your own command line option parser to achieve this, though.
Here is a very simple example script that seems to work (but likely has edge cases and other problems):
require 'pp'
json = {}
ARGV.each_cons(2) do |key, value|
next unless key.start_with? '--'
json[key] = value.start_with?('--') ? nil : value
end
pp json
Would that work for your purposes? :)
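One edge case of the each_cons approach is a flag that appears last on the command line, since it never shows up as the first element of a pair. The following is a sketch of one way to cover that case; parse_flags is a hypothetical helper name, and stripping the leading -- from the keys is an assumption about the desired output.

```ruby
require 'json'

# Maps each "--flag" to the argument that follows it, or to nil when the
# flag is followed by another flag or by nothing at all.
def parse_flags(argv)
  result = {}
  argv.each_with_index do |arg, i|
    next unless arg.start_with?('--')
    following = argv[i + 1]
    missing = following.nil? || following.start_with?('--')
    result[arg.sub(/\A--/, '')] = missing ? nil : following
  end
  result
end

puts JSON.generate(parse_flags(%w[--foo --baz qux]))  # prints {"foo":null,"baz":"qux"}
```

Note that a value which itself starts with -- cannot be expressed with this scheme, so it is only suitable when such values never occur.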

Ruby treats everything except nil and false as truthy. Therefore, an empty string evaluates as truthy (as do 0 and 0.0).
You can force the nil behaviour by having something like this:
ARGV.map! do |arg|
if arg.empty?
nil
else
arg
end
end
This way, any empty string in ARGV is replaced with nil.

You're going to have to decide on a special string to mean nil. I would suggest "\0" (in the shell, double quotes ensure the literal string backslash zero is passed in). You can then use a special type converter in OptionParser:
require 'optparse'
require 'json'
values = {}
parser = OptionParser.new do |opts|
  # Register the converter before any option refers to it.
  opts.accept(:nulldetect) do |string|
    # Single quotes: the two literal characters backslash and zero.
    string == '\0' ? nil : string
  end
  opts.on("--foo VAL", :nulldetect) { |val| values[:foo] = val }
  opts.on("--bar VAL", :nulldetect) { |val| values[:bar] = val }
end
parser.parse!
puts values.to_json
Obviously, this requires you to be explicit about the flags you accept, but if you want to accept any flag, you can certainly just hand-write the if statements that check for "\0".
As a side note, you can make a flag's argument optional, in which case the block receives nil when the flag is given without a value, e.g.
opts.on("--foo [VAL]") do |val|
# if --foo is passed without an argument, val is nil
end
So a bare --foo is another possible stand-in, if you would rather not pass an explicit "\0".


How to pass empty string to command line using OptionParser

I am trying to write a script that takes some arguments, some of which might be empty.
It seems that Ruby's OptionParser does not allow that and raises OptionParser::InvalidArgument.
Code:
require 'optparse'
options = {}
OptionParser.new do |opt|
opt.on('--might_be_empty might_be_empty', String) { |o| options[:might_be_empty] = o }
end.parse!
puts "might_be_empty: #{options[:might_be_empty]}"
Happy flow:
ruby ./for_stack.rb --might_be_empty "some_real_data"
might_be_empty: some_real_data
When the value is empty:
ruby ./for_stack.rb --might_be_empty ""
./for_stack.rb:10:in `<main>': invalid argument: --might_be_empty (OptionParser::InvalidArgument)
How can I tell the OptionParser to allow empty strings?
Leave coercion type unspecified, or use Object instead of String. Both behave the same.
opt.on('--might_be_empty might_be_empty') { ... }
# ..or
opt.on('--might_be_empty might_be_empty', Object) { ... }
Test:
ruby ./for_stack.rb --might_be_empty "some_real_data"
might_be_empty: some_real_data
ruby ./for_stack.rb --might_be_empty ""
might_be_empty:
According to the docs for OptionParser Type Coercion, passing String isn't just a "do nothing":
String – Any non-empty string
However, if you just leave the argument pattern off of on (whose documentation directs you to the docs for make_switch):
Acceptable option argument format, must be pre-defined with OptionParser.accept or OptionParser#accept, or Regexp. This can appear once or assigned as String if not present, otherwise causes an ArgumentError.
While it is slightly confusing that it's "assigned as String if not present", that is not the same as "assigned as a non-empty String if not present": it defaults to passing you any String, and works as you want it to:
opt.on('--might_be_empty might_be_empty') { |o| options[:might_be_empty] = o }
# is optional
% ruby example.rb
might_be_empty:
# if passed, must have a value
% ruby example.rb --might_be_empty
Traceback (most recent call last):
example.rb:8:in '<main>': missing argument: --might_be_empty (OptionParser::MissingArgument)
# can pass an empty string
% ruby example.rb --might_be_empty ""
might_be_empty:
# can pass any string
% ruby example.rb --might_be_empty "not empty"
might_be_empty: not empty
If you don't want to just leave the argument pattern off, you can create custom conversions, though this seems like overkill to me.
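For completeness, a custom conversion could look roughly like the sketch below. AnyString is a hypothetical marker class and parse_options a hypothetical wrapper, both invented for this example; the only OptionParser calls used are accept, on and parse. The pattern /.*/m is chosen because it also matches the empty string.

```ruby
require 'optparse'

# Hypothetical marker class; it only serves as the lookup key for the conversion.
class AnyString; end

# Parses argv (an array of strings) and returns the collected options hash.
def parse_options(argv)
  options = {}
  parser = OptionParser.new
  # Accept any string, including the empty one, and pass it through unchanged.
  parser.accept(AnyString, /.*/m) { |s| s }
  parser.on('--might_be_empty VALUE', AnyString) { |v| options[:might_be_empty] = v }
  parser.parse(argv)
  options
end

p parse_options(['--might_be_empty', ''])[:might_be_empty]  # prints ""
```

In practice this behaves like leaving the type off, which is why it is probably only worth doing if you also want to transform the value inside the conversion block.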
OptionParser also allows optional values. Running this several times:
require 'optparse'
options = {}
OptionParser.new do |opt|
opt.on('--might_be_empty [arg]') { |o| options[:might_be_empty] = o }
end.parse!
puts 'options[:might_be_empty].has_key? is %s' % options.has_key?(:might_be_empty)
puts 'options[:might_be_empty] is "%s"' % options[:might_be_empty]
puts 'options[:might_be_empty] is a %s' % options[:might_be_empty].class
pp ARGV
Shows me:
$ ruby test.rb -m
options[:might_be_empty].has_key? is true
options[:might_be_empty] is ""
options[:might_be_empty] is a NilClass
[]
$ ruby test.rb -m foo
options[:might_be_empty].has_key? is true
options[:might_be_empty] is "foo"
options[:might_be_empty] is a String
[]
$ ruby test.rb -m 1
options[:might_be_empty].has_key? is true
options[:might_be_empty] is "1"
options[:might_be_empty] is a String
[]
This is documented a couple times in the sample code but not explicitly stated in the text:
def perform_inplace_option(parser)
# Specifies an optional option argument
parser.on("-i", "--inplace [EXTENSION]",
"Edit ARGV files in place",
"(make backup if EXTENSION supplied)") do |ext|
self.inplace = true
self.extension = ext || ''
self.extension.sub!(/\A\.?(?=.)/, ".") # Ensure extension begins with dot.
end
end
Also note that you don't have to coerce the returned value because it's already a String. Any values passed in from the command-line are strings as they're snatched from ARGV.

Ruby `downcase!` returns `nil`

With this code:
input = gets.chomp.downcase!
puts input
If there is at least one uppercase letter in the input, the input is printed with its uppercase letters downcased. But if the input has no uppercase letters, input is nil and puts prints an empty line, as if nothing had been typed.
I want my input to be fully downcased; if it is a string with no uppercase letter, it should return the same string.
I thought about something like this:
input = gets.chomp
if input.include(uppercase) then input.downcase! end
But this doesn't work. I hope someone has an idea on how I should do this.
According to the docs for String:
(emphasis mine)
downcase
Returns a copy of str with all uppercase letters replaced with their lowercase counterparts. The operation is locale
insensitive—only characters “A” to “Z” are affected. Note: case
replacement is effective only in ASCII region.
downcase!
Downcases the contents of str, returning nil if no changes were made. Note: case replacement is effective only in ASCII
region.
Basically it says that downcase! (with the exclamation mark) will return nil if there are no uppercase letters.
To fix your program:
input = gets.chomp.downcase
puts input
Hope that helped!
This will work:
input = gets.chomp.downcase
puts input
String#downcase
Returns a modified string and leaves the original unmodified.
str = "Hello world!"
str.downcase # => "hello world!"
str # => "Hello world!"
String#downcase!
Modifies the original string, returns nil if no changes were made or returns the new string if a change was made.
str = "Hello world!"
str.downcase! # => "hello world!"
str # => "hello world!"
str.downcase! # => nil
! (bang) methods
It's common for Ruby methods with ! / non-! variants to behave in a similar manner. See this post for an in-depth explanation why.
The reason that downcase! returns nil is so you know whether or not the object was changed. If you're assigning the modified string to another variable, like you are here, you should use downcase instead (without the bang !).
If you're not familiar, the standard library bang methods typically act on the receiver directly. That means this:
foo = "Hello"
foo.downcase!
foo #=> "hello"
Versus this:
foo = "Hello"
bar = foo.downcase
foo #=> "Hello"
bar #=> "hello"
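To make the "so you know whether the object was changed" point concrete, here is a small sketch that uses the return value of downcase! as a change flag. The helper name downcase_in_place is hypothetical and exists only for this illustration.

```ruby
# Downcases str in place and reports whether anything actually changed.
# downcase! returns nil when the string contained nothing to downcase.
def downcase_in_place(str)
  changed = !str.downcase!.nil?
  [str, changed]
end

p downcase_in_place("Hello".dup)  # prints ["hello", true]
p downcase_in_place("hello".dup)  # prints ["hello", false]
```

The .dup calls are only there so the example does not mutate string literals directly.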

Ruby Command Line Implicit Conditional Check

I ran the following from a bash shell:
echo 'hello world' | ruby -ne 'puts $_ if /hello/'
I thought it was a typo at first, but to my surprise it printed hello world.
I meant to type:
echo 'hello world' | ruby -ne 'puts $_ if /hello/ === $_'
Can anyone explain, or point to documentation for, why we get this implicit comparison to $_?
I'd also like to note:
echo 'hello world' | ruby -ne 'puts $_ if /test/'
Won't output anything.
The Ruby parser has a special case for regular expression literals in conditionals. Normally (i.e. without using the -e, -n or -p command line options) this code:
if /foo/
puts "TRUE!"
end
produces:
$ ruby regex-in-conditional1.rb
regex-in-conditional1.rb:1: warning: regex literal in condition
Assigning something that matches the regex to $_ first, like this:
$_ = 'foo'
if /foo/
puts "TRUE!"
end
produces:
$ ruby regex-in-conditional2.rb
regex-in-conditional2.rb:2: warning: regex literal in condition
TRUE!
This is a (poorly documented) exception to the normal rules for Ruby conditionals, where anything that’s not false or nil evaluates as truthy.
This only applies to regex literals. The following behaves as you might expect for a conditional:
regex = /foo/
if regex
puts "TRUE!"
end
output:
$ ruby regex-in-conditional3.rb
TRUE!
This is handled in the parser. Searching the MRI code for the text of the warning produces a single match in parse.y:
case NODE_DREGX:
case NODE_DREGX_ONCE:
warning_unless_e_option(parser, node, "regex literal in condition");
return NEW_MATCH2(node, NEW_GVAR(rb_intern("$_")));
I don’t know Bison, so I can’t explain exactly what is going on here, but there are some clues you can deduce. The warning_unless_e_option function simply suppresses the warning if the -e option has been set, as this feature is discouraged in normal code but can be useful in expressions from the command line (this explains why you don’t see the warning in your code). The next line seems to be constructing a parse subtree which is a regular expression match between the regex and the $_ global variable, which contains “[t]he last input line of string by gets or readline”. These nodes will then be compiled into the usual regular expression method call.
That shows what is happening. I’ll just finish with a quote from the Kernel#gets documentation, which may explain why this is such an obscure feature:
The style of programming using $_ as an implicit parameter is gradually losing favor in the Ruby community.
After digging through the Ruby source (MRI), I think I found an explanation.
The code:
pp RubyVM::InstructionSequence.compile('puts "hello world" if /hello/').to_a
produces the following output:
...
[:trace, 1],
[:putobject, /hello/],
[:getspecial, 0, 0],
[:opt_regexpmatch2],
...
The instructions seem to be calling opt_regexpmatch2 with two arguments, the first being the regex /hello/ and the second being the return value of getspecial.
getspecial can be found in insns.def:
/**
#c variable
#e Get value of special local variable ($~, $_, ..).
#j 特殊なローカル変数($~, $_, ...)の値を得る。
*/
DEFINE_INSN
getspecial
(rb_num_t key, rb_num_t type)
()
(VALUE val)
{
val = vm_getspecial(th, GET_LEP(), key, type);
}
Note that our instructions are most likely telling the VM to bring back the value of $_. $_ is automatically set for us when we run ruby with the correct options, e.g., -n.
Now that we have our two arguments, we call opt_regexpmatch2:
/**
#c optimize
#e optimized regexp match 2
#j 最適化された正規表現マッチ 2
*/
DEFINE_INSN
opt_regexpmatch2
(CALL_INFO ci)
(VALUE obj2, VALUE obj1)
(VALUE val)
{
if (CLASS_OF(obj2) == rb_cString &&
BASIC_OP_UNREDEFINED_P(BOP_MATCH, STRING_REDEFINED_OP_FLAG)) {
val = rb_reg_match(obj1, obj2);
}
else {
PUSH(obj2);
PUSH(obj1);
CALL_SIMPLE_METHOD(obj2);
}
}
At the end of the day, if /hello/ is equivalent to if $_ =~ /hello/, and $_ will be nil unless we run ruby with the correct options.
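If you would rather not rely on the implicit $_ at all, the same filtering can be written with an explicit match. This is only a sketch of that alternative, not what the -n switch does internally; matching_lines is a hypothetical helper, and StringIO stands in for standard input so the example is self-contained.

```ruby
require 'stringio'

# Returns the lines of io that match pattern, using an explicit =~
# instead of the implicit comparison against $_.
def matching_lines(io, pattern)
  io.each_line.select { |line| line =~ pattern }
end

input = StringIO.new("hello world\nsomething else\n")
puts matching_lines(input, /hello/)  # prints hello world
```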

Passing all arguments at once to a method

I am trying to read arguments from a text file and then pass them all at once to a Ruby method.
The arguments in the text file are properly formatted e.g.:
"path", ["elem1","elem2"], 4,"string"
I intend to make a function call like this:
my_method("path", ["elem1","elem2"], 4,"string")
I am trying to achieve this like so:
IO.readlines("path").each do |line|
puts "#{line}"
my_method(*line.split(","))
end
The problem is that in the method all the array elements are wrapped in quotes. So my method ends up getting this:
""path"", "["elem1","elem2"]", "4",""string""
Now, this is probably because it's an array of strings, but why are they wrapped in an additional "" when I say *arr?
If I use eval:
IO.readlines("path").each do |line|
puts "#{line}"
my_method(*eval(line))
end
I end up with syntax error, unexpected ',' after the first argument in "path", ["elem1","elem2"], 4,"string".
How do I pass all the elements to the method at once when reading the arguments from a text file?
Also, since Ruby does not care about types, why do I have to wrap my arguments in "" in the first place? If I don't wrap an argument in quotes, I get an undefined variable for main:Object error.
I have one solution, but instead of using "," as your delimiter, use some other special character as the delimiter in the input line.
# Input line in somefile.txt delimited by "||" :
# "path" || ["elem1","elem2"] || 4 || "string"
def my_method(arg1, arg2, arg3, arg4)
path = arg1
arr = arg2.gsub(/([\[\]])/, "").split(",")
number = arg3.to_i
string = arg4
puts "path : #{path} and is #{path.class}"
puts "arr : #{arr} and is #{arr.class}"
puts "number : #{number} and is #{number.class}"
puts "string : #{string} and is #{string.class}"
end
IO.readlines("somefile.txt").each do |line|
my_method(*line.gsub(/[(\\")]/, " ").split("||"))
end
I hope this helped you out. Let me know if you have any problem.
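IO.readlines("path").each do |line|
  # Wrap the whole line in brackets so it evaluates to one array of arguments.
  # Splitting on "," first would cut the nested array in half, and each
  # (unlike map) would discard the evaluated values anyway.
  params = eval("[#{line}]")
  my_method(*params)
end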
When you read the line, all params are strings, so to get arrays and integers you might try to eval them first.
The eval tip might be enough to fix your code.
If you pass a param without quotes, the interpreter will treat it as a variable or method name and not as a string. That's why you get the undefined variable error, so string values in the file still need their quotes.
Note: Be careful with eval, since it will execute any code found in that file, such as a command to erase files or even worse (like messing with your computer or server), if whoever writes the file knows that it will be evaluated.
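Given the risk of eval, one alternative worth considering is to treat each line as the inside of a JSON array, since the sample line "path", ["elem1","elem2"], 4,"string" happens to be valid JSON once it is wrapped in brackets. The sketch below assumes every line follows that shape; parse_argument_line is a hypothetical helper and my_method here is only a stand-in for the real method.

```ruby
require 'json'

# Parses one line of comma-separated literals by wrapping it in brackets
# and reading it as a JSON array. No code from the file is ever executed.
def parse_argument_line(line)
  JSON.parse("[#{line}]")
end

# Stand-in for the real method; it just returns what it received.
def my_method(path, elems, count, label)
  [path, elems, count, label]
end

args = parse_argument_line('"path", ["elem1","elem2"], 4,"string"')
p my_method(*args)  # prints ["path", ["elem1", "elem2"], 4, "string"]
```

This only handles values that JSON can express (strings, numbers, arrays, hashes, true, false, null), and a malformed line raises JSON::ParserError instead of running arbitrary code.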

How could I check to see if a word exists in a string, and return false if it doesn't, in ruby?

Say I have a string str = "Things to do: eat and sleep."
How could I check if "do: " exists in str, case insensitive?
Like this:
puts "yes" if str =~ /do:/i
To return a boolean value (from a method, presumably), compare the result of the match to nil:
def has_do(str)
(str =~ /do:/i) != nil
end
Or, if you don’t like the != nil then you can use !~ instead of =~ and negate the result:
def has_do(str)
not str !~ /do:/i
end
But I don’t really like double negations …
In Ruby 1.9 you can do it like this:
str.downcase.match("do: ") do
puts "yes"
end
It's not exactly what you asked for, but I noticed a comment to another answer. If you don't mind using regular expressions when matching the string, perhaps there is a way to skip the downcase part to get case insensitivity.
For more info, see String#match
You could also do this:
str.downcase.include? "Some string".downcase
If all I'm looking for is a case-insensitive substring match I usually use:
str.downcase['do: ']
9 times out of 10 I don't care where in the string the match is, so this is nice and concise.
Here's what it looks like in IRB:
>> str = "Things to do: eat and sleep." #=> "Things to do: eat and sleep."
>> str.downcase['do: '] #=> "do: "
>> str.downcase['foobar'] #=> nil
Because it returns nil if there is no hit it works in conditionals too.
"Things to do: eat and sleep.".index(/do: /i)
index returns the position where the match starts, or nil if not found.
You can learn more about index method here:
http://ruby-doc.org/core/classes/String.html
Or about regex here:
http://www.regular-expressions.info/ruby.html
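Pulling the answers above together, a helper that returns a strict true or false for an arbitrary search string might look like the sketch below. The name includes_ignoring_case? is hypothetical; Regexp.escape is used so that characters such as . or ? in the search string are matched literally.

```ruby
# Returns true if fragment occurs anywhere in str, ignoring case,
# and false otherwise (never nil).
def includes_ignoring_case?(str, fragment)
  !str.index(/#{Regexp.escape(fragment)}/i).nil?
end

str = "Things to do: eat and sleep."
p includes_ignoring_case?(str, "DO: ")    # prints true
p includes_ignoring_case?(str, "foobar")  # prints false
```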
