Ruby - Given string with ENV variables how do I get their values - ruby

Say I have
str = "DISABLE_THINGY=true -p my_profile"
To get the value of DISABLE_THINGY (true) I can use
str.partition("DISABLE_THINGY=").last.split(' ').first
I do not want to do that.
There must be a library that parses all this for me.
Anybody know some better ways?

The selected answer is way too convoluted to solve such a simple problem. Regular expressions are great, but the more complex they are, the more likely they'll be wrong:
str = "DISABLE_THINGY=true -p my_profile"
str[/\w+=(\w+)/, 1] # => "true"
/\w+=(\w+)/ simply looks for "words" joined by =.
See String's [] method for more information.
If you had a number of assignments and wanted to capture them all, or, wanted to capture the name and value of this one:
str = "DISABLE_THINGY=true -p my_profile"
str.scan(/\w+=\w+/).map { |s| s.split('=') } # => [["DISABLE_THINGY", "true"]]
That returns an array-of-arrays, which can be useful, or, you could convert that to a Hash:
str.scan(/\w+=\w+/).map { |s| s.split('=') }.to_h # => {"DISABLE_THINGY"=>"true"}
and similarly:
str = "DISABLE_THINGY=true FOO=bar -p my_profile"
str.scan(/\w+=\w+/).map { |s| s.split('=') } # => [["DISABLE_THINGY", "true"], ["FOO", "bar"]]
str.scan(/\w+=\w+/).map { |s| s.split('=') }.to_h # => {"DISABLE_THINGY"=>"true", "FOO"=>"bar"}

Take a look at "Parse command line arguments in a Ruby script".
If all of your arguments don't use a hyphen, you might have to make slight tweaks to the regex used, but this should get you where you need to go. Just replace ARGV.join(' ') in the accepted answer with your str var.
Adjusted the regex in the link provided above to make your use-case work where you combine ENV variables with command line parameters:
args = Hash[ str.scan(/-{0,2}([^=\s]+)(?:[=\s](\S+))?/) ] => {"DISABLE_THINGY"=>"true", "p"=>"my_profile"}

Related

Ruby extract string via regular expression

I have these strings:
'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
From these two strings, I want to extract these two file names:
'2017_11/view_mission_join_player_count2017_11'
'2017_11/activily_time2017_11'
I wrote some regular expressions, but they seem wrong.
str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
str[/([^\/index.html]+)/, 1] # => "a_r"
Regular expression is an overkill here, and i prone to errors.
input = [
"da_report/GY4LFDN6/" \
"2017_11/view_mission_join_player_count2017_11" \
"/index.html",
"da_report/GY4LFDN6/" \
"2017_11/activily_time2017_11" \
"/index.html"
]
input.map { |str| str.split('/')[2..3].join('/') }
#⇒ [
# [0] "2017_11/view_mission_join_player_count2017_11",
# [1] "2017_11/activily_time2017_11"
# ]
or, more elegant:
input.map { |str| str.split('/').grep(/2017_/).join('/') }
Use /(?<=GY4LFDN6\/)(.*)(?=\/index.html)/
str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
str[/(?<=GY4LFDN6\/)(.*)(?=\/index.html)/]
=> "2017_11/view_mission_join_player_count2017_11"
live demo: http://rubular.com/r/Ued6UOXWDf
This answer assumes that you want to capture beginning with the third component of the path, up to and including the last component of the path before the filename. If so, then we can use the following regex pattern:
(?:[^/]*/){2}(.*)/.*
The quantity in parentheses is the capture group, i.e. what you want to extract from the entire path.
str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
puts str[/(?:[^\/]*\/){2}(.*)\/.*/, 1]
Demo
If you are looking for the values at the end of the string like in the format string/string followed by /filename.extension, you could use a positive lookahead for a file name.
\w+\/\w+(?=\/\w+\.\w+$)
Demo
Based on your examples, you may be able to use a very simple regex.
def extract(str)
str[/\d{4}_\d{2}.+\d{4}_\d{2}/]
end
extract 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
#=> "2017_11/view_mission_join_player_count2017_11"
extract 'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
#=> "2017_11/activily_time2017_11"

Check if local variable is defined given it's name as string in ruby

Can I check if a local variable is defined given it's name as string?
I know there is the function defined?, but you have to give the variable itself.
Example:
a = 'cat'
print defined?(a) # => "cat"
print defined?(b) # => nil
What I need is:
a = 'cat'
print string_defined?("a") # => "cat"
print string_defined?("b") # => nil
Or something like that. I can't find it in the docs...
I tried to use respond_to?, but doesn't seems to work...
Starting with Ruby 2.1.0 you can use Binding#local_variable_defined?:
a = 'cat'
binding.local_variable_defined? 'a' #=> true
binding.local_variable_defined? 'b' #=> false
The following will return true when the local variable in question is (to be) defined in the context, not necessary in a position preceding the point of it:
local_variables.include?("a".to_sym)
#=> true
You can do it using eval:
a = 'cat'
eval("defined?(#{'a'})")
=> "local-variable"
eval("defined?(#{'b'})")
=> nil
Disclaimer: This answer makes use of eval, so it can be dangerous if you don't strictly control the string you want to pass into it. And definitely you shouldn't do it this way if these strings come from user input.

Ruby regex method

I need to get the expected output in ruby by using any method like scan or match.
Input string:
"http://test.com?t&r12=1&r122=1&r1=1&r124=1"
"http://test.com?t&r12=1&r124=1"
Expected:
r12=1,r122=1, r1=1, r124=1
r12=1,r124=1
How can I get the expected output using regex?
Use regex /r\d+=\d+/:
"http://test.com?t&r12=1&r122=1&r1=1&r124=1".scan(/r\d+=\d+/)
# => ["r12=1", "r122=1", "r1=1", "r124=1"]
"http://test.com?t&r12=1&r124=1".scan(/r\d+=\d+/)
# => ["r12=1", "r124=1"]
You can use join to get a string output. Here:
"http://test.com?t&r12=1&r122=1&r1=1&r124=1".scan(/r\d+=\d+/).join(',')
# => "r12=1,r122=1,r1=1,r124=1"
Update
If the URL contains other parameters that may include r in end, the regex can be made stricter:
a = []
"http://test.com?r1=2&r12=1&r122=1&r1=1&r124=1&ar1=2&tr2=3&xy4=5".scan(/(&|\?)(r+\d+=\d+)/) {|x,y| a << y}
a.join(',')
# => "r12=1,r122=1,r1=1,r124=1"
While input strings are urls with queries, I would safeguard myself from the false positives:
input = "http://test.com?t&r12=1&r122=1&r1=1&r124=1"
query_params = input.split('?').last.split('&')
#⇒ ["t", "r12=1", "r122=1", "r1=1", "r124=1"]
r_params = query_params.select { |e| e =~ /\Ar\d+=\d+/ }
#⇒ ["r12=1", "r122=1", "r1=1", "r124=1"]
r_params.join(',')
#⇒ "r12=1,r122=1,r1=1,r124=1"
It’s safer than just scan the original input for any regexp.
If you really need to do it with regex correctly, you'll need to use a regex like this:
puts "http://test.com?t&r12=1&r122=1&r1=1&r124=1".scan(/(?:http.*?\?t|(?<!^)\G)\&*(\br\d*=\d*)(?=.*$)/i).join(',')
puts "http://test.com?t&r12=1&r124=1".scan(/(?:http.*?\?t|(?<!^)\G)\&*(\br\d*=\d*)(?=.*$)/i).join(',')
Sample program output:
r12=1,r122=1,r1=1,r124=1
r12=1,r124=1

resolve #{var} in string

I have loaded a string with #{variable} references in it. How would I resolve those variables in the string like puts does?
name="jim"
str="Hi #{name}"
puts str
Instead of puts, I would like to have the result available to pass as a parameter or save into a variable.
you could eval it
name = "Patrick"
s = 'hello, #{name}'
s # => "hello, \#{name}"
# wrap the string in double quotes, making it a valid interpolatable ruby string
eval "\"#{s}\"" # => "hello, Patrick"
puts doesn't resolve the variables. The Ruby parser does when it creates the string. if you passed str to any other method, it would be the same as passing 'Hi jim', since the interpolation is already done.
String has a format option that appears as %. It can be used to pass arguments into a predefined string much like interpolation does.
message = "Hello, %s"
for_patrick = message % "Patrick" #=> "Hello, Patrick"
for_jessie = message % "Jessie" #=> "Hello, Jessie"
messages = "Hello, %s and %s"
for_p_and_j = messages % ["Patrick", "Jessie"] #=> "Hello, Patrick and Jessie"
It may not look "Rubyish" but I believe it is the functionality you are looking for.
So, if you have a string coming in from somewhere that contains these placeholders, you can then pass in values as arguments as so:
method_that_gets_hello_message % "Patrick"
This will also allow you to only accept values you are expecting.
message = "I can count to %d"
message % "eleven" #=> ArgumentError: invalid value for Integer()
There's a list on Wikipedia for possible placeholders for printf() that should also work in Ruby.
The eval seems to be the only solution for this particular task. But we can avoid this dirty-unsafe-dishonourable eval if we modify the task a bit: we can resolve not local, but instance variable without eval using instance_variable_get:
#name = "Patrick"
#id = 2 # Test that number is ok
#a_b = "oooo" # Test that our regex can eat underscores
s = 'hello, #{name} !!#{id} ??#{a_b}'
s.gsub(/#\{(\w+)\}/) { instance_variable_get '#'+$1 }
=> "hello, Patrick !!2 ??oooo"
In this case you even can use any other characters instead of #{} (for example, %name% etc), by only modifying the regex a bit.
But of course, all this smells.
It sounds like you want the basis for a template system, which Ruby does easily if you use String's gsub or sub methods.
replacements = { '%greeting%' => 'Hello', '%name%' => 'Jim' }
pattern = Regexp.union(replacements.keys)
'%greeting% %name%!'.gsub(pattern, replacements)
=> "Hello Jim!"
You could just as easily define the key as:
replacements = { '#{name}' => 'Jim' }
and use Ruby's normal string interpolation #{...} but I'd recommend not reusing that. Instead use something unique.
The advantage to this is the target => replacement map can easily be put into a YAML file, or a database table, and then you can swap them out with other languages, or different user information. The sky is the limit.
The benefit to this also, is there is no evaluation involved, it's only string substitution. With a bit of creative use you can actually implement macros:
macros = { '%salutation%' => '%greeting% %name%' }
replacements = { '%greeting%' => 'Hello', '%name%' => 'Jim' }
macro_pattern, replacement_pattern = [macros, replacements].map{ |h| Regexp.union(h.keys) }
'%salutation%!'.gsub(macro_pattern, macros).gsub(replacement_pattern, replacements)
=> "Hello Jim!"

Regex with named capture groups getting all matches in Ruby

I have a string:
s="123--abc,123--abc,123--abc"
I tried using Ruby 1.9's new feature "named groups" to fetch all named group info:
/(?<number>\d*)--(?<chars>\s*)/
Is there an API like Python's findall which returns a matchdata collection? In this case I need to return two matches, because 123 and abc repeat twice. Each match data contains of detail of each named capture info so I can use m['number'] to get the match value.
Named captures are suitable only for one matching result.
Ruby's analogue of findall is String#scan. You can either use scan result as an array, or pass a block to it:
irb> s = "123--abc,123--abc,123--abc"
=> "123--abc,123--abc,123--abc"
irb> s.scan(/(\d*)--([a-z]*)/)
=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]
irb> s.scan(/(\d*)--([a-z]*)/) do |number, chars|
irb* p [number,chars]
irb> end
["123", "abc"]
["123", "abc"]
["123", "abc"]
=> "123--abc,123--abc,123--abc"
Chiming in super-late, but here's a simple way of replicating String#scan but getting the matchdata instead:
matches = []
foo.scan(regex){ matches << $~ }
matches now contains the MatchData objects that correspond to scanning the string.
You can extract the used variables from the regexp using names method. So what I did is, I used regular scan method to get the matches, then zipped names and every match to create a Hash.
class String
def scan2(regexp)
names = regexp.names
scan(regexp).collect do |match|
Hash[names.zip(match)]
end
end
end
Usage:
>> "aaa http://www.google.com.tr aaa https://www.yahoo.com.tr ddd".scan2 /(?<url>(?<protocol>https?):\/\/[\S]+)/
=> [{"url"=>"http://www.google.com.tr", "protocol"=>"http"}, {"url"=>"https://www.yahoo.com.tr", "protocol"=>"https"}]
#Nakilon is correct showing scan with a regex, however you don't even need to venture into regex land if you don't want to:
s = "123--abc,123--abc,123--abc"
s.split(',')
#=> ["123--abc", "123--abc", "123--abc"]
s.split(',').inject([]) { |a,s| a << s.split('--'); a }
#=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]
This returns an array of arrays, which is convenient if you have multiple occurrences and need to see/process them all.
s.split(',').inject({}) { |h,s| n,v = s.split('--'); h[n] = v; h }
#=> {"123"=>"abc"}
This returns a hash, which, because the elements have the same key, has only the unique key value. This is good when you have a bunch of duplicate keys but want the unique ones. Its downside occurs if you need the unique values associated with the keys, but that appears to be a different question.
If using ruby >=1.9 and the named captures, you could:
class String
def scan2(regexp2_str, placeholders = {})
return regexp2_str.to_re(placeholders).match(self)
end
def to_re(placeholders = {})
re2 = self.dup
separator = placeholders.delete(:SEPARATOR) || '' #Returns and removes separator if :SEPARATOR is set.
#Search for the pattern placeholders and replace them with the regex
placeholders.each do |placeholder, regex|
re2.sub!(separator + placeholder.to_s + separator, "(?<#{placeholder}>#{regex})")
end
return Regexp.new(re2, Regexp::MULTILINE) #Returns regex using named captures.
end
end
Usage (ruby >=1.9):
> "1234:Kalle".scan2("num4:name", num4:'\d{4}', name:'\w+')
=> #<MatchData "1234:Kalle" num4:"1234" name:"Kalle">
or
> re="num4:name".to_re(num4:'\d{4}', name:'\w+')
=> /(?<num4>\d{4}):(?<name>\w+)/m
> m=re.match("1234:Kalle")
=> #<MatchData "1234:Kalle" num4:"1234" name:"Kalle">
> m[:num4]
=> "1234"
> m[:name]
=> "Kalle"
Using the separator option:
> "1234:Kalle".scan2("#num4#:#name#", SEPARATOR:'#', num4:'\d{4}', name:'\w+')
=> #<MatchData "1234:Kalle" num4:"1234" name:"Kalle">
I needed something similar recently. This should work like String#scan, but return an array of MatchData objects instead.
class String
# This method will return an array of MatchData's rather than the
# array of strings returned by the vanilla `scan`.
def match_all(regex)
match_str = self
match_datas = []
while match_str.length > 0 do
md = match_str.match(regex)
break unless md
match_datas << md
match_str = md.post_match
end
return match_datas
end
end
Running your sample data in the REPL results in the following:
> "123--abc,123--abc,123--abc".match_all(/(?<number>\d*)--(?<chars>[a-z]*)/)
=> [#<MatchData "123--abc" number:"123" chars:"abc">,
#<MatchData "123--abc" number:"123" chars:"abc">,
#<MatchData "123--abc" number:"123" chars:"abc">]
You may also find my test code useful:
describe String do
describe :match_all do
it "it works like scan, but uses MatchData objects instead of arrays and strings" do
mds = "ABC-123, DEF-456, GHI-098".match_all(/(?<word>[A-Z]+)-(?<number>[0-9]+)/)
mds[0][:word].should == "ABC"
mds[0][:number].should == "123"
mds[1][:word].should == "DEF"
mds[1][:number].should == "456"
mds[2][:word].should == "GHI"
mds[2][:number].should == "098"
end
end
end
I really liked #Umut-Utkan's solution, but it didn't quite do what I wanted so I rewrote it a bit (note, the below might not be beautiful code, but it seems to work)
class String
def scan2(regexp)
names = regexp.names
captures = Hash.new
scan(regexp).collect do |match|
nzip = names.zip(match)
nzip.each do |m|
captgrp = m[0].to_sym
captures.add(captgrp, m[1])
end
end
return captures
end
end
Now, if you do
p '12f3g4g5h5h6j7j7j'.scan2(/(?<alpha>[a-zA-Z])(?<digit>[0-9])/)
You get
{:alpha=>["f", "g", "g", "h", "h", "j", "j"], :digit=>["3", "4", "5", "5", "6", "7", "7"]}
(ie. all the alpha characters found in one array, and all the digits found in another array). Depending on your purpose for scanning, this might be useful. Anyway, I love seeing examples of how easy it is to rewrite or extend core Ruby functionality with just a few lines!
A year ago I wanted regular expressions that were more easy to read and named the captures, so I made the following addition to String (should maybe not be there, but it was convenient at the time):
scan2.rb:
class String
#Works as scan but stores the result in a hash indexed by variable/constant names (regexp PLACEHOLDERS) within parantheses.
#Example: Given the (constant) strings BTF, RCVR and SNDR and the regexp /#BTF# (#RCVR#) (#SNDR#)/
#the matches will be returned in a hash like: match[:RCVR] = <the match> and match[:SNDR] = <the match>
#Note: The #STRING_VARIABLE_OR_CONST# syntax has to be used. All occurences of #STRING# will work as #{STRING}
#but is needed for the method to see the names to be used as indices.
def scan2(regexp2_str, mark='#')
regexp = regexp2_str.to_re(mark) #Evaluates the strings. Note: Must be reachable from here!
hash_indices_array = regexp2_str.scan(/\(#{mark}(.*?)#{mark}\)/).flatten #Look for string variable names within (#VAR#) or # replaced by <mark>
match_array = self.scan(regexp)
#Save matches in hash indexed by string variable names:
match_hash = Hash.new
match_array.flatten.each_with_index do |m, i|
match_hash[hash_indices_array[i].to_sym] = m
end
return match_hash
end
def to_re(mark='#')
re = /#{mark}(.*?)#{mark}/
return Regexp.new(self.gsub(re){eval $1}, Regexp::MULTILINE) #Evaluates the strings, creates RE. Note: Variables must be reachable from here!
end
end
Example usage (irb1.9):
> load 'scan2.rb'
> AREA = '\d+'
> PHONE = '\d+'
> NAME = '\w+'
> "1234-567890 Glenn".scan2('(#AREA#)-(#PHONE#) (#NAME#)')
=> {:AREA=>"1234", :PHONE=>"567890", :NAME=>"Glenn"}
Notes:
Of course it would have been more elegant to put the patterns (e.g. AREA, PHONE...) in a hash and add this hash with patterns to the arguments of scan2.
Piggybacking off of Mark Hubbart's answer, I added the following monkey-patch:
class ::Regexp
def match_all(str)
matches = []
str.scan(self) { matches << $~ }
matches
end
end
which can be used as /(?<letter>\w)/.match_all('word'), and returns:
[#<MatchData "w" letter:"w">, #<MatchData "o" letter:"o">, #<MatchData "r" letter:"r">, #<MatchData "d" letter:"d">]
This relies on, as others have said, the use of $~ in the scan block for the match data.
I like the match_all given by John, but I think it has an error.
The line:
match_datas << md
works if there are no captures () in the regex.
This code gives the whole line up to and including the pattern matched/captured by the regex. (The [0] part of MatchData) If the regex has capture (), then this result is probably not what the user (me) wants in the eventual output.
I think in the case where there are captures () in regex, the correct code should be:
match_datas << md[1]
The eventual output of match_datas will be an array of pattern capture matches starting from match_datas[0]. This is not quite what may be expected if a normal MatchData is wanted which includes a match_datas[0] value which is the whole matched substring followed by match_datas[1], match_datas[[2],.. which are the captures (if any) in the regex pattern.
Things are complex - which may be why match_all was not included in native MatchData.

Resources