Match string that doesn't contain a specific word

I'm working in Ruby with the match method, and I want to use a regular expression to match a URL that doesn't contain a certain string.
For example:
http://website1.com/url_with_some_words.html
http://website2.com/url_with_some_other_words.html
http://website3.com/url_with_the_word_dog.html
I want to match the URLs that don't contain the word dog, so the 1st and the 2nd ones should be matched.

Just use a negative lookahead: ^(?!.*dog).*$
Explanation
^ : matches the beginning of a line
(?!.*dog) : negative lookahead; succeeds only if the word dog does not appear anywhere on the rest of the line
.* : matches everything (except newlines in this case)
$ : matches the end of a line
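To make the explanation concrete, here is a minimal sketch that applies the lookahead pattern to the three URLs from the question. The constant name NO_DOG and the variable names are only illustrative, and Enumerable#grep is just one of several ways to apply the pattern to a list.

```ruby
# Pattern from the answer: matches a line only when "dog" appears nowhere in it.
NO_DOG = /^(?!.*dog).*$/

# Sample URLs taken from the question.
urls = [
  'http://website1.com/url_with_some_words.html',
  'http://website2.com/url_with_some_other_words.html',
  'http://website3.com/url_with_the_word_dog.html'
]

# Enumerable#grep keeps the elements for which the pattern matches.
without_dog = urls.grep(NO_DOG)
p without_dog
# => ["http://website1.com/url_with_some_words.html", "http://website2.com/url_with_some_other_words.html"]
```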

Just use
string !~ /dog/
to select strings you need.

There's actually a very simple way to do this, using select:
array_of_urls.select { |url| !url.match(/dog/) }
This returns an array of the URLs that don't contain the word 'dog' anywhere.

Another thing you can use is:
!url['dog']
With your example:
array = []
array << 'http://website1.com/url_with_some_words.html'
array << 'http://website2.com/url_with_some_other_words.html'
array << 'http://website3.com/url_with_the_word_dog.html'
array.select { |url| !url['dog'] }
You could also reject the urls that do contain 'dog':
array.reject { |url| url['dog'] }
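Another option, assuming Ruby 2.3 or later, is Enumerable#grep_v, which returns the elements that do not match a pattern. The sketch below reuses the URLs from the question; the variable names are illustrative only.

```ruby
urls = [
  'http://website1.com/url_with_some_words.html',
  'http://website2.com/url_with_some_other_words.html',
  'http://website3.com/url_with_the_word_dog.html'
]

# grep_v is the inverse of grep: it keeps the elements that do NOT match.
without_dog = urls.grep_v(/dog/)

# For comparison, reject with a block gives the same result.
same_result = urls.reject { |url| url =~ /dog/ }

p without_dog
```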

Related

Drop elements from array if regexp does not match

Is there a way to do this?
I have an array:
["file_1.jar", "file_2.jar","file_3.pom"]
And I want to keep only "file_3.pom", what I want to do is something like this:
array.drop_while{|f| /.pom/.match(f)}
But this way I keep everything in the array except "file_3.pom". Is there a way to do something like "not_match"?
I found these:
f !~ /.pom/ # => leaves all elements in array
OR
f !~ /*.pom/ # => leaves all elements in array
But none of those returns what I expect.

How about select?
selected = array.select { |f| /.pom/.match(f) }
p selected
# => ["file_3.pom"]
Hope that helps!

In your case you can use the Enumerable#grep method to get an array of the elements that match a pattern:
["file_1.jar", "file_2.jar", "file_3.pom"].grep(/\.pom\z/)
# => ["file_3.pom"]
As you can see, I've also slightly modified your regular expression so that it matches only strings that end with .pom:
\. matches a literal dot; without the \ it matches any character
\z anchors the pattern to the end of the string; without it the pattern would match .pom anywhere in the string.
Since you are searching for a literal string, you can also avoid regular expressions altogether, for example by using the methods String#end_with? and Array#select:
["file_1.jar", "file_2.jar", "file_3.pom"].select { |s| s.end_with?('.pom') }
# => ["file_3.pom"]
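The difference between the loose and the strict pattern only shows up with less convenient input. In the sketch below, the file name my_pom_notes.jar is a made-up example added to expose it.

```ruby
# 'my_pom_notes.jar' is a hypothetical name chosen to trip up the loose pattern.
files = ['file_1.jar', 'my_pom_notes.jar', 'file_3.pom']

# Unescaped dot, no anchor: any character followed by "pom", anywhere in the string.
loose = files.grep(/.pom/)

# Escaped dot plus \z: a literal ".pom" at the very end of the string.
strict = files.grep(/\.pom\z/)

p loose   # => ["my_pom_notes.jar", "file_3.pom"]
p strict  # => ["file_3.pom"]
```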

If you want to keep only the strings that match a regexp, you can use the Ruby method keep_if.
Note that this method is destructive: it modifies the original array in place.
a = ["file_1.jar", "file_2.jar","file_3.pom"]
a.keep_if{|file_name| /.pom/.match(file_name)}
p a
# => ["file_3.pom"]
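The in-place behaviour is easiest to see next to the non-destructive select. This is a small sketch using the array from the question; the variable names are illustrative.

```ruby
files = ['file_1.jar', 'file_2.jar', 'file_3.pom']

# select returns a new array and leaves the receiver untouched.
poms = files.select { |name| name.end_with?('.pom') }
size_after_select = files.size

# keep_if removes the non-matching elements from the receiver itself.
files.keep_if { |name| name.end_with?('.pom') }
size_after_keep_if = files.size

p poms                # => ["file_3.pom"]
p size_after_select   # => 3
p size_after_keep_if  # => 1
```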

regex for a pattern at end of string

I have a string which looks like:
hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0
Using a regex, I want to get the part of the string after the last '/' up to the end of the line, i.e. for the examples above the output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0
I tried ^(.+)\/(.+)$, which gives me two captures: the first is "hello/world" and the second is "1.9.2-some-text"
Is there a way to just get "1.9.2-some-text" as the output?

Try using a negated character class ([^…]) like this:
[^\/]+$
This will match one or more of any character other than / followed by the end of the string.

You can use a negated character class here. match returns a MatchData object, so take element 0 to get the matched text:
'hello/world/1.9.2-some-text'.match(Regexp.new('[^/]+$'))[0]
# => "1.9.2-some-text"
The pattern means: any character except / (1 or more times), followed by the end of the string.
However, the simplest way would be to split the string.
'hello/world/1.9.2-some-text'.split('/').last
# => "1.9.2-some-text"
OR
'hello/world/1.9.2-some-text'.split('/')[-1]
# => "1.9.2-some-text"

If you do not need to use a regex, the usual way of doing this is:
File.basename("hello/world/1.9.2-some-text")
#=> "1.9.2-some-text"

This is one way:
s = 'hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0'
s.lines.map { |l| l[/.*\/(.*)/,1] }
#=> ["1.9.2-some-text", "2.0.2-some-text", "2.11.0"]
You said, "in above examples output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0". That's neither a string nor an array, so I assumed you wanted an array. If you want a string, tack .join(', ') onto the end.
Regex quantifiers are "greedy" by default, so .*\/ matches all characters up to and including the last / in each line. The second argument, 1, returns the contents of the capture group (.*) (capture group 1).
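To see what greediness changes, compare the greedy .* with its lazy counterpart .*? on one of the sample strings. This is only a sketch; the variable names are illustrative.

```ruby
path = 'hello/world/1.9.2-some-text'

# Greedy: .* consumes as much as possible, so \/ ends up being the LAST slash.
greedy = path[/.*\/(.*)/, 1]

# Lazy: .*? consumes as little as possible, so \/ ends up being the FIRST slash.
lazy = path[/.*?\/(.*)/, 1]

p greedy  # => "1.9.2-some-text"
p lazy    # => "world/1.9.2-some-text"
```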

regex get words between braces and quotes (just the words)

I got a string in Ruby like this:
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"
Now I'd like to get an array containing only the words between the round braces (...), without the quotes. The regex below works, but it includes 'enum', which I don't need.
str.scan(/\w+/)
The expected result should be:
{"OPTICAL"=>"optical", "DISPLAY"=>"display", "OTHER"=>"other", "FLOPPY"=>"floppy", "STORAGE"=>"storage", "NIC"=>"nic", "HDD"=>"hdd", "CPU"=>"cpu"}
thanks!

I'd suggest using negative lookahead to eliminate words followed by (:
str.scan(/\w+(?!\w|\()/)
Edit: regex updated, now it also excludes \w, so it won't match word prefixes.
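Here is a short sketch of what the lookahead version returns for the string from the question, next to the plain \w+ scan. The variable names are illustrative.

```ruby
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"

# Plain scan: every run of word characters, including "enum".
all_words = str.scan(/\w+/)

# With the lookahead, a run of word characters is rejected when it is
# followed by another word character or by "(", which rules out "enum"
# and all of its prefixes.
words = str.scan(/\w+(?!\w|\()/)

p all_words.first  # => "enum"
p words
# => ["cpu", "hdd", "storage", "nic", "display", "optical", "floppy", "other"]
```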

Based on the output you wanted this will work.
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"
arr = str.scan(/'(\w+)'/)
hs = Hash[arr.map { |e| [e.first.upcase,e.first] }]
p hs #=> {"CPU"=>"cpu", "HDD"=>"hdd", "STORAGE"=>"storage", "NIC"=>"nic", "DISPLAY"=>"display", "OPTICAL"=>"optical", "FLOPPY"=>"floppy", "OTHER"=>"other"}

arrays in Ruby, how to handle this situation?

Let's say I have the following array:
arr = ["", "2121", "8", "myString"]
I want to return false in case the array contains any non-digit symbols.

arr.all? { |s| s =~ /\A\d+\z/ }
This checks whether each element consists only of digits (\d). If any element does not, false is returned. \A and \z anchor the pattern to the whole string; ^ and $ would only anchor it to a line within the string.
Edit: You didn't completely specify whether the empty string is valid. If it is, the line has to be rewritten as follows (as per DarkDust):
arr.all? { |s| s =~ /\A\d*\z/ }
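The two variants can be compared side by side. In this sketch the method name all_digits? and its allow_empty keyword are hypothetical helpers, not part of any library, and the patterns use \A and \z so that they anchor to the whole string.

```ruby
# Hypothetical helper: true when every string consists only of digits.
# allow_empty switches between \d+ (at least one digit) and \d* (zero or more).
def all_digits?(strings, allow_empty: false)
  pattern = allow_empty ? /\A\d*\z/ : /\A\d+\z/
  strings.all? { |s| s =~ pattern }
end

p all_digits?(["", "2121", "8", "myString"])          # => false
p all_digits?(["", "2121", "8"])                      # => false
p all_digits?(["", "2121", "8"], allow_empty: true)   # => true
p all_digits?(["2121", "8"])                          # => true
```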

If empty strings are allowed:
def contains_non_digit(array)
  !array.select { |s| s =~ /^.*[^0-9].*$/ }.empty?
end
Explanation: this filters the array for all strings that match a regular expression. The regex matches any string that contains at least one non-digit character. If the resulting array is empty, the array contains no non-digit strings. Finally, we need to negate the result, because we want to know whether the array does contain non-digit strings.
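The same check can be sketched more directly with any? and \D (any non-digit character), which avoids building an intermediate array. The method name contains_non_digit? is a hypothetical variant of the one above.

```ruby
# Hypothetical variant: true as soon as one string contains a non-digit character.
# Empty strings contain no characters at all, so they never trigger it.
def contains_non_digit?(array)
  array.any? { |s| s =~ /\D/ }
end

p contains_non_digit?(["", "2121", "8", "myString"])  # => true
p contains_non_digit?(["", "2121", "8"])              # => false
```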

Remove all non-alphabetical, non-numerical characters from a string?

If I wanted to remove things like:
.!,'"^-# from an array of strings, how would I go about this while retaining all alphabetical and numeric characters?
Allowed alphabetical characters should also include letters with diacritical marks, such as à or ç.

You should use a regex with the correct character property. In this case, you can invert the Alnum class (Alphabetic and numeric character):
"◊¡ Marc-André !◊".gsub(/\p{^Alnum}/, '') # => "MarcAndré"
For more complex cases, say you also wanted to keep punctuation, you can build a set of acceptable characters like this (note that the hyphen counts as punctuation, so it is kept):
"◊¡ Marc-André !◊".gsub(/[^\p{Alnum}\p{Punct}]/, '') # => "¡Marc-André!"
For all character properties, you can refer to the Regexp documentation.
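To illustrate why the character property matters for letters with diacritical marks, this sketch compares an ASCII-only character class with \p{^Alnum}. The é is written as the escape \u00e9 purely to keep the example independent of the source file's encoding.

```ruby
name = "Marc-Andr\u00e9"  # "Marc-André"

# ASCII-only class: the accented letter is treated as "not alphanumeric" and removed.
ascii_only = name.gsub(/[^a-zA-Z0-9]/, '')

# Unicode-aware property: the accented letter counts as alphanumeric and is kept.
unicode_aware = name.gsub(/\p{^Alnum}/, '')

p ascii_only     # => "MarcAndr"
p unicode_aware  # => "MarcAndré"
```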

string.gsub(/[^[:alnum:]]/, "")

The following will work for an array:
z = ['asfdå', 'b12398!', 'c98347']
z.each { |s| s.gsub! /[^[:alnum:]]/, '' }
puts z.inspect
I borrowed Jeremy's suggested regex.

You might consider a regular expression.
http://www.regular-expressions.info/ruby.html
I'm assuming that you're using Ruby since you tagged that in your post. You could go through the array, test each element against a regexp, and remove or keep it depending on whether it passes.
A regexp you might use could look something like this:
[^.!,^#-]
That matches a character that is not one of the characters inside the brackets. The hyphen is placed last so that it is read as a literal hyphen and not as a range operator. However, I suggest that you look up regular expressions; you might find a better solution once you know their syntax and usage.

If you truly have an array (as you state) and it is an array of strings (I'm guessing), e.g.
foo = [ "hello", "42 cats!", "yöwza" ]
then I can imagine that you either want to update each string in the array with a new value, or that you want a modified array that only contains certain strings.
If the former (you want to 'clean' every string in the array) you could do one of the following:
foo.each{ |s| s.gsub! /\p{^Alnum}/, '' } # Change every string in place…
bar = foo.map{ |s| s.gsub /\p{^Alnum}/, '' } # …or make an array of new strings
#=> [ "hello", "42cats", "yöwza" ]
If the latter (you want to select a subset of the strings where each matches your criteria of holding only alphanumerics) you could use one of these:
# Select only those strings that contain ONLY alphanumerics
bar = foo.select{ |s| s =~ /\A\p{Alnum}+\z/ }
#=> [ "hello", "yöwza" ]
# Shorthand method for the same thing
bar = foo.grep /\A\p{Alnum}+\z/
#=> [ "hello", "yöwza" ]
In Ruby, regular expressions of the form /\A………\z/ require the entire string to match, as \A anchors the regular expression to the start of the string and \z anchors to the end.
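The difference from the line anchors ^ and $ shows up with multi-line strings. A small sketch, using a made-up two-line string:

```ruby
text = "hello\n42 cats!"  # hypothetical two-line input

# ^ and $ anchor to line boundaries, so the first line alone satisfies the pattern.
line_anchored = text =~ /^\p{Alnum}+$/

# \A and \z anchor to the whole string, so the space, "!" and newline make it fail.
string_anchored = text =~ /\A\p{Alnum}+\z/

p line_anchored    # => 0
p string_anchored  # => nil
```

Here =~ returns the index of the match (0) in the first case and nil in the second.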
