Ruby if statement to exclude multiple string variations - ruby

I'm trying to parse an array that I've created to ultimately write the 'good' values to a file. The array may look something like this, however the contents may change, so I can't match for a certain value:
array = ["10.10.10.0/24", "10.10.10.1/32", "10.10.10.129/32", "127.0.0.0/8", "169.254.0.0/16", "192.168.1.0/24", "255.255.255.255/32"]
I believe that it makes sense to check the array values before writing to the file and not write the values I know that I don't want. In this case, the values would always be:
10.10.10.1/32
10.10.10.129/32
127.0.0.0/8
169.254.0.0/16
255.255.255.255/32
My initial if statement looked like this, which sort of accomplished what I am after, but not completely:
if !network.include?("/32" || "127.0.0.0/8" || "169.254.0.0/16" || "255.255.255.255/32")
file.write("#{network}\n")
end
Which results in (lines 2 & 3 shouldn't have been included):
10.10.10.0/24
127.0.0.0/8
169.254.0.0/16
192.168.1.0/24
What have I done wrong? Is there a better way to perform the lookup/matching/exclusion?

networks = ["10.10.10.0/24", "10.10.10.1/32", "10.10.10.129/32", "127.0.0.0/8", "169.254.0.0/16", "192.168.1.0/24", "255.255.255.255/32"]
banned_networks = [/\/32/, "127.0.0.0/8", "169.254.0.0/16", "255.255.255.255/32"]
networks.reject do |e|
case e
when *banned_networks
true
end
end.each {|network| file.write("#{network}\n")}

You can't use "or" || like that.
Better might be...
exclude_entries = [ '/32',
'127.0.0.0/8',
'169.254.0.0/16',
'255.255.255.255/32'
]
match_pattern = Regexp.union(exclude_entries)
array.reject { |n| n =~ match_pattern }.each do |network|
file.write("#{network}\n")
end
The problem is that the expression "/32" || "127.0.0.0/8" always returns "/32": || just returns its first "truthy" operand, and "/32" is "truthy".
Edited to use regular expression so as to exclude partial text.
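To make the point above concrete, here is a minimal sketch. It assumes the same sample data as the question and uses a StringIO in place of the real file, which is purely an assumption for the demo. Regexp.union escapes plain strings, so the dots in the addresses only match literal dots.

```ruby
require "stringio"

networks = ["10.10.10.0/24", "10.10.10.1/32", "10.10.10.129/32",
            "127.0.0.0/8", "169.254.0.0/16", "192.168.1.0/24",
            "255.255.255.255/32"]

# `||` returns its first truthy operand, so only "/32" would ever be tested.
first_operand = ("/32" || "127.0.0.0/8")

# Regexp.union escapes each string and joins the results with "|".
banned = Regexp.union("/32", "127.0.0.0/8", "169.254.0.0/16")

# grep_v keeps the elements that do NOT match the pattern.
kept = networks.grep_v(banned)

# StringIO stands in for the output file (an assumption for this sketch).
out = StringIO.new
kept.each { |network| out.write("#{network}\n") }
```

With this data only 10.10.10.0/24 and 192.168.1.0/24 are written.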

Related

Use ruby to remove a part of a string on each entry in an array where it exists

I have a list of file paths, for example
[
'Useful',
'../Some.Root.Directory/Path/Interesting',
'../Some.Root.Directory/Path/Also/Interesting'
]
(I mention that they're file paths in case there is something that makes this task easier because they're files but they can be considered simply a set of strings some of which may start with a particular string)
and I need to make this into a set of pairs so that I have the original list but also
[
'Useful',
'Interesting',
'Also/Interesting'
]
I expected I'd be able to do this
'../Some.Root.Directory/Path/Interesting'.gsub!('../Some.Root.Directory/Path/', '')
or
'../Some.Root.Directory/Path/Interesting'.gsub!('\.\.\/Some\.Root\.Directory\/Path\/', '')
but neither of those replaces the provided string/pattern with an empty string...
So in irb
puts '../Some.Root.Directory/Path/Interesting'.gsub('\.\.\/Some\.Root\.Directory\/Path\/', '')
outputs
../Some.Root.Directory/Path/Interesting
and the desired output is
Interesting
How can I do this?
NB the path will be passed in so really I have
file_path.gsub!(removal_path, '')
If you are positive that strings start with removal_path you can do:
string[removal_path.size..-1]
to get the remaining part.
If you want to get pairs of the original paths and the shortened ones, you can use sub in combination with map:
a = [
'../Some.Root.Directory/Path/Interesting',
'../Some.Root.Directory/Path/Also/Interesting'
]
b = a.map do |v|
[v, v.sub('../Some.Root.Directory/Path/', '')]
end
puts b
This will return an Array of arrays - each sub-array contains the original path plus the shortened one. As noted by @sawa - you can simply use sub instead of gsub, since you want to replace only a single occurrence.
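As a rough sketch of the same idea, String#delete_prefix (Ruby 2.5 and later) removes the prefix only when the string actually starts with it, so entries like 'Useful' pass through untouched. The variable names here are illustrative. The last line shows why the escaped pattern in the question did nothing: a String argument to gsub is matched literally, backslashes included.

```ruby
paths = [
  'Useful',
  '../Some.Root.Directory/Path/Interesting',
  '../Some.Root.Directory/Path/Also/Interesting'
]
removal_path = '../Some.Root.Directory/Path/'

# Pair each original path with its shortened form.
pairs = paths.map { |path| [path, path.delete_prefix(removal_path)] }

# A String pattern is not a regex: the backslashes are searched for literally,
# so nothing matches and the path comes back unchanged.
unchanged = paths[1].gsub('\.\.\/Some\.Root\.Directory\/Path\/', '')
```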

Case Statement using || (OR)

I am trying to use the following code to assign an email alias via an API to our ticketing system.
@email.cc_list = case @site_id
when /site1/ || /site2/; "smail-alias-1"
when /site3/ || /site4/ || /site5/ || /site6/; "email-alias-2"
when /site7/ || /site8/; "email-alias-3"
when /site9/; "email-alias-4"
when /site10/; "email-alias-5"
end
The problem is that only sites 1, 3, 7, 9, and 10 are actually being assigned properly. Anything after the || isn't working.
I would rather avoid 10 when statements for 5 aliases. Is there a way that I can make this case statement work with a hash in order to get the system to determine when it matches the specified site_id? Or is there another way to make the or functions work?
You could write:
@email.cc_list = case @site_id
when /site(1|2)/ then "smail-alias-1"
when /site(3|4|5|6)/ then "email-alias-2"
when /site(7|8)/ then "email-alias-3"
when /site9/ then "email-alias-4"
when /site10/ then "email-alias-5"
end
This isn't another solution, since others have given those, but it is useful to know why your original code fails.
case @site_id
when /site1/ || /site2/ ...
does not translate to:
if @site_id =~ /site1/ || @site_id =~ /site2/ ...
Ruby evaluates the when expression first and only then compares the result with the case value using ===. The || operator returns its first truthy operand, and every Regexp object is truthy, so /site1/ || /site2/ is simply /site1/. The clause therefore behaves like:
if /site1/ === @site_id ...
and /site2/ is never tested.
A related trap: a bare regex literal used directly as an if condition is not treated as an ordinary value. Ruby matches it against $_ and can emit:
warning: regex literal in condition
You might prefer to write:
@email.cc_list = case @site_id[/\d+/].to_i
when 1,2 then "smail-alias-1"
when 3..6 then "email-alias-2"
when 7,8 then "email-alias-3"
when 9 then "email-alias-4"
when 10 then "email-alias-5"
end
You could join your regexes w/ Regexp.union:
@email.cc_list = case @site_id
when Regexp.union(/site1/, /site2/); "smail-alias-1"
when Regexp.union(/site3/, /site4/, /site5/, /site6/); "email-alias-2"
when Regexp.union(/site7/, /site8/); "email-alias-3"
when /site9/; "email-alias-4"
when /site10/; "email-alias-5"
end
Or make the regex like /site1|site2/ yourself which is what union would basically do for you here.
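Another option worth sketching: a when clause accepts several comma-separated values and matches if any of them does, which is the usual way to write "or" in a case. The helper name alias_for is hypothetical, and the sketch assumes the ids look like "site10". The \b after the number is there because an unanchored /site1/ also matches "site10", so the first clause would otherwise claim it.

```ruby
# Hypothetical helper; the name alias_for is not from the original code.
def alias_for(site_id)
  case site_id
  # Commas mean "any of these"; \b stops site1 from also matching site10.
  when /site1\b/, /site2\b/ then "smail-alias-1"
  when /site[3-6]\b/        then "email-alias-2"
  when /site7\b/, /site8\b/ then "email-alias-3"
  when /site9\b/            then "email-alias-4"
  when /site10\b/           then "email-alias-5"
  end
end

puts alias_for("site10")
```

An id that matches none of the clauses returns nil, the same as the original case without an else.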

Ruby gsub / regex with several arguments [duplicate]

I'm new to ruby and I'm trying to solve a problem.
I'm parsing through several text fields where I want to remove the header, which has different values. It works fine when the header is always the same:
variable = variable.gsub(/(^Header_1:$)/, '')
But when I put in several arguments it doesn't work:
variable = variable.gsub(/(^Header_1$)/ || /(^Header_2$)/ || /(^Header_3$)/ || /(^Header_4$)/ || /^:$/, '')
You can use Regexp.union:
regex = Regexp.union(
/^Header_1/,
/^Header_2/,
/^Header_3/,
/^Header_4/,
/^:$/
)
variable.gsub(regex, '')
Please note that ^something$ will not match a line that contains anything besides something :)
That's because in Ruby ^ matches at the beginning of a line and $ at the end of a line (use \A and \z to anchor to the whole string).
So I intentionally removed the $.
Also, you do not need the parentheses when you only want to remove the matched string.
You can also use it like this:
headers = %w[Header_1 Header_2 Header_3]
regex = Regexp.union(*headers.map{|s| /^#{s}/}, /^\:$/, /etc/)
variable.gsub(regex, '')
And of course you can remove headers without explicitly defining them.
Most likely there is whitespace after the headers?
If so, you can do it as simply as:
variable = "Header_1 something else"
puts variable.gsub(/(^Header[^\s]*)?(.*)/, '\2')
#=> something else
variable = "Header_BLAH something else"
puts variable.gsub(/(^Header[^\s]*)?(.*)/, '\2')
#=> something else
Just use a proper regexp:
variable.gsub(/^(Header_1|Header_2|Header_3|Header_4|:)$/, '')
If the header is always the same format of Header_n, where n is some integer value, then you can simplify your regex greatly:
/Header_\d+/
will find every one of these:
%w[Header_1 Header_2 Header_3].grep(/Header_\d+/)
[
[0] "Header_1",
[1] "Header_2",
[2] "Header_3"
]
Tweaking it to handle finding words, not substrings:
/^Header_\d+$/
or:
/\bHeader_\d+\b/
As mentioned, using Regexp.union is a good start, but, used blindly, can result in very slow or inefficient patterns, so think ahead and help out the engine by giving it useful sub-patterns to work with:
values = %w[foo bar]
/Header_(?:\d+|#{ values.join('|') })/
=> /Header_(?:\d+|foo|bar)/
Unfortunately, Ruby doesn't have the equivalent to Perl's Regexp::Assemble module, which can build highly optimized patterns from big lists of words. Search here on Stack Overflow for examples of what it can do. For instance:
use Regexp::Assemble;
my @values = ('Header_1', 'Header_2', 'foo', 'bar', 'Header_3');
my $ra = Regexp::Assemble->new;
foreach (@values) {
$ra->add($_);
}
print $ra->re, "\n";
=> (?-xism:(?:Header_[123]|bar|foo))
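Back in Ruby, the pieces above can be combined roughly as follows. This is only a sketch: the sample text is made up, and it assumes each header sits alone on its own line. Regexp.union builds the alternation from the list of names, and the surrounding ^ and $ keep Header_1 from matching inside Header_10.

```ruby
headers = %w[Header_1 Header_2 Header_3 Header_4]

# Regexp.union(headers).source is "Header_1|Header_2|Header_3|Header_4".
header_pattern = /^(?:#{Regexp.union(headers).source}|:)$/

# Made-up sample input for the demo.
text = "Header_1\nfirst body\nHeader_3\n:\nsecond body\n"

# Header lines and the lone ":" line are emptied; their newlines remain.
cleaned = text.gsub(header_pattern, '')
```

If the now-empty lines should go as well, appending \n? to the pattern is one way to do it.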

Ruby Regexp group matching, assign variables on 1 line

I'm currently trying to regexp a string into multiple variables. Example string:
ryan_string = "RyanOnRails: This is a test"
I've matched it with this regexp, with 3 groups:
ryan_group = ryan_string.scan(/(^.*)(:)(.*)/i)
Now to access each group I have to do something like this:
ryan_group[0][0] (first group) RyanOnRails
ryan_group[0][1] (second group) :
ryan_group[0][2] (third group) This is a test
This seems pretty ridiculous and it feels like I'm doing something wrong. I would expect to be able to do something like this:
g1, g2, g3 = ryan_string.scan(/(^.*)(:)(.*)/i)
Is this possible? Or is there a better way than how I'm doing it?
You don't want scan for this, as it makes little sense. You can use String#match which will return a MatchData object, you can then call #captures to return an Array of captures. Something like this:
#!/usr/bin/env ruby
string = "RyanOnRails: This is a test"
one, two, three = string.match(/(^.*)(:)(.*)/i).captures
p one #=> "RyanOnRails"
p two #=> ":"
p three #=> " This is a test"
Be aware that if no match is found, String#match will return nil, so something like this might work better:
if match = string.match(/(^.*)(:)(.*)/i)
one, two, three = match.captures
end
Although scan makes little sense for this, it does still do the job; you just need to flatten the returned Array first:
one, two, three = string.scan(/(^.*)(:)(.*)/i).flatten
You could use match or =~ instead, which would give you a single match, and you could either access the match data the same way or just use the special match variables $1, $2, $3.
Something like:
if ryan_string =~ /(^.*)(:)(.*)/i
first = $1
third = $3
end
You can name your captured matches
string = "RyanOnRails: This is a test"
/(?<one>^.*)(?<two>:)(?<three>.*)/i =~ string
puts one, two, three
It doesn't work if you reverse the order of string and the regex.
You have to decide whether it is a good idea, but a Ruby regexp with named groups can (automagically) define local variables for you! This only happens when a regex literal is on the left-hand side of =~.
I am not yet sure whether this feature is awesome or just totally crazy.
ryan_string = "RyanOnRails: This is a test"
/^(?<webframework>.*)(?<colon>:)(?<rest>.*)/ =~ ryan_string
# This defined three variables for you. Crazy, but true.
webframework # => "RyanOnRails"
puts "W: #{webframework} , C: #{colon}, R: #{rest}"
(Take a look at http://ruby-doc.org/core-2.1.1/Regexp.html , search for "local variable").
Note:
As pointed out in a comment, I see that there is a similar and earlier answer to this question by @toonsend (https://stackoverflow.com/a/21412455). I do not think I was "stealing", but if you want to be fair with praises and honor the first answer, feel free :) I hope no animals were harmed.
scan() will find all non-overlapping matches of the regex in your string, so instead of returning an array of your groups like you seem to be expecting, it is returning an array of arrays.
You are probably better off using match(), and then getting the array of captures using MatchData#captures:
g1, g2, g3 = ryan_string.match(/(^.*)(:)(.*)/i).captures
However you could also do this with scan() if you wanted to:
g1, g2, g3 = ryan_string.scan(/(^.*)(:)(.*)/i)[0]
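Putting the match-based suggestions together, a small wrapper might look like the sketch below. The helper name split_label is hypothetical. Note one deliberate difference from the question's pattern: [^:]* splits on the first colon, whereas the greedy (^.*) splits on the last one. Returning nil on no match keeps the one-line assignment safe, because destructuring nil just leaves both variables nil.

```ruby
# Hypothetical helper: splits "Label: text" on the first colon.
def split_label(text)
  match = text.match(/\A(?<name>[^:]*):(?<rest>.*)\z/)
  return nil unless match

  # Named captures are read from the MatchData by symbol.
  [match[:name], match[:rest].strip]
end

name, rest = split_label("RyanOnRails: This is a test")
puts name
puts rest
```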

How to write a Ruby switch statement (case...when) with regex and backreferences?

I know that I can write a Ruby case statement to check a match against a regular expressions.
However, I'd like to use the match data in my return statement. Something like this semi-pseudocode:
foo = "10/10/2011"
case foo
when /^([0-9][0-9])/
print "the month is #{match[1]}"
else
print "something else"
end
How can I achieve that?
Thanks!
Just a note: I understand that I wouldn't ever use a switch statement for a simple case as above, but that is only one example. In reality, what I am trying to achieve is the matching of many potential regular expressions for a date that can be written in various ways, and then parsing it with Ruby's Date class accordingly.
The references to the latest regex matching groups are always stored in pseudo variables $1 to $9:
case foo
when /^([0-9][0-9])/
print "the month is #{$1}"
else
print "something else"
end
You can also use the $~ pseudo variable (also available as $LAST_MATCH_INFO after require 'English') to get at the whole MatchData object. This can be useful when using named captures:
case foo
when /^(?<number>[0-9][0-9])/
print "the month is #{$~['number']}"
else
print "something else"
end
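Since the goal is to recognise several date layouts and hand the parts to Date, here is a sketch of how that could look. The helper name parse_date and the two formats are assumptions for illustration, not something from the question. Each when clause sets $~ (the last MatchData), so the named groups are available inside the branch.

```ruby
require "date"

# Hypothetical helper; the two supported layouts are assumptions for the demo.
def parse_date(text)
  case text
  when %r{\A(?<month>\d{2})/(?<day>\d{2})/(?<year>\d{4})\z}
    m = $~ # MatchData set by the when clause that just matched
    Date.new(m[:year].to_i, m[:month].to_i, m[:day].to_i)
  when /\A(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})\z/
    m = $~
    Date.new(m[:year].to_i, m[:month].to_i, m[:day].to_i)
  end
end

puts parse_date("10/10/2011")
```

Input that matches neither layout returns nil, so the caller can decide how to handle it.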
Here's an alternative approach that gets you the same result but doesn't use a switch. If you put your regular expressions in an array, you could do something like this:
res = [ /pat1/, /pat2/, ... ]
m = nil
res.find { |re| m = foo.match(re) }
# Do what you will with `m` now.
Declaring m outside the block keeps it available after find is done with the block, and find stops as soon as the block returns a true value, so you get the same short-circuiting behavior that a switch gives you. This gives you the full MatchData if you need it (perhaps you want to use named capture groups in your regexes) and nicely separates your regexes from your search logic (which may or may not yield clearer code). You could even load your regexes from a config file or choose which set of them you wanted at run time.
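The same idea can be wrapped in a small method so the MatchData is the return value rather than a variable captured from outside the block. This is a sketch only: first_match and the two patterns are made up for the demo.

```ruby
# Hypothetical helper: returns the MatchData of the first pattern that matches.
def first_match(text, patterns)
  patterns.each do |re|
    m = re.match(text)
    return m if m # stop at the first hit, like a case statement would
  end
  nil
end

# Made-up patterns for the demo.
patterns = [
  /\A(?<year>\d{4})-(?<month>\d{2})\z/,
  %r{\A(?<month>\d{2})/(?<year>\d{4})\z}
]

m = first_match("03/2020", patterns)
puts m[:year] if m
```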

Resources