I'm working in Ruby with the match method, and I want to use a regular expression to match a URL that doesn't contain a certain string.
For example:
http://website1.com/url_with_some_words.html
http://website2.com/url_with_some_other_words.html
http://website3.com/url_with_the_word_dog.html
I want to match the URLs that don't contain the word dog, so the 1st and 2nd ones should be matched.
Just use a negative lookahead ^(?!.*dog).*$.
Explanation
^ : match the beginning of the line
(?!.*dog) : negative lookahead, assert that the word dog does not appear anywhere ahead on the line
.* : match everything (except newlines in this case)
$ : match the end of the line
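To make the explanation concrete, here is a minimal sketch that runs the pattern over the three URLs from the question. The variable names (urls, no_dog, matched) are only illustrative.

```ruby
# Sample URLs taken from the question.
urls = [
  'http://website1.com/url_with_some_words.html',
  'http://website2.com/url_with_some_other_words.html',
  'http://website3.com/url_with_the_word_dog.html'
]

# The lookahead rejects any line that has "dog" somewhere after its start.
no_dog = /^(?!.*dog).*$/

# String#match returns nil (falsy) when the pattern does not match.
matched = urls.select { |url| url.match(no_dog) }
puts matched
```

Only the first two URLs should be printed, since the third contains "dog".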
Just use
string !~ /dog/
to select strings you need.
There's actually an incredibly simple way to do this, using select.
array_of_urls.select { |url| !url.match(/dog/) }
This returns an array of the URLs that don't contain the word 'dog' anywhere.
Another thing you can use is:
!url['dog']
With your example:
array = []
array << 'http://website1.com/url_with_some_words.html'
array << 'http://website2.com/url_with_some_other_words.html'
array << 'http://website3.com/url_with_the_word_dog.html'
array.select { |url| !url['dog'] }
You could also reject the urls that do contain 'dog':
array.reject { |url| url['dog'] }
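If you are on Ruby 2.3 or later, Enumerable#grep_v is another option that seems to fit here: it keeps the elements that do not match a pattern. A small sketch with the same sample URLs (the variable name kept is just for illustration):

```ruby
array = [
  'http://website1.com/url_with_some_words.html',
  'http://website2.com/url_with_some_other_words.html',
  'http://website3.com/url_with_the_word_dog.html'
]

# grep_v is the inverse of grep: it returns the elements that do NOT match.
kept = array.grep_v(/dog/)
puts kept
```

This should give the same result as the select and reject versions above.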
Is there a way to do this?
I have an array:
["file_1.jar", "file_2.jar","file_3.pom"]
And I want to keep only "file_3.pom". What I want to do is something like this:
array.drop_while{|f| /.pom/.match(f)}
But this way I keep everything in the array, not just "file_3.pom". Is there a way to do something like "not_match"?
I found these:
f !~ /.pom/ # => leaves all elements in array
OR
f !~ /*.pom/ # => leaves all elements in array
But none of those returns what I expect.
How about select?
selected = array.select { |f| /.pom/.match(f) }
p selected
# => ["file_3.pom"]
Hope that helps!
In your case you can use the Enumerable#grep method to get an array of the elements that match a pattern:
["file_1.jar", "file_2.jar", "file_3.pom"].grep(/\.pom\z/)
# => ["file_3.pom"]
As you can see, I've also slightly modified your regular expression so that it actually matches only strings that end with .pom:
\. matches a literal dot; without the \ it matches any character
\z anchors the pattern to the end of the string; without it the pattern would match .pom anywhere in the string.
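A quick sketch of the difference between the two patterns. The file name 'apompom.jar' is a made-up example, added only to show a false positive.

```ruby
# Hypothetical file names; only the last one really ends with ".pom".
files = ['file_1.jar', 'apompom.jar', 'file_3.pom']

# Unescaped dot, no anchor: any character followed by "pom", anywhere.
loose = files.grep(/.pom/)

# Escaped dot plus \z: a literal ".pom" at the very end of the string.
strict = files.grep(/\.pom\z/)

p loose
p strict
```

The loose pattern also picks up 'apompom.jar', while the strict one returns only 'file_3.pom'.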
Since you are searching for a literal string, you can also avoid regular expressions altogether, for example by using the methods String#end_with? and Array#select:
["file_1.jar", "file_2.jar", "file_3.pom"].select { |s| s.end_with?('.pom') }
# => ["file_3.pom"]
If you want to keep only the strings that match a regexp, you can use the Ruby method keep_if.
But note that this method is destructive: it modifies the original array.
a = ["file_1.jar", "file_2.jar","file_3.pom"]
a.keep_if{|file_name| /.pom/.match(file_name)}
p a
# => ["file_3.pom"]
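To illustrate the difference between the destructive and non-destructive versions, here is a small sketch. The variable names are illustrative, and I've used end_with? instead of a regexp to keep it short.

```ruby
original = ['file_1.jar', 'file_2.jar', 'file_3.pom']

# select returns a new array and leaves the receiver untouched.
copy = original.select { |name| name.end_with?('.pom') }

# keep_if removes the non-matching elements from the receiver itself.
mutated = original.dup
mutated.keep_if { |name| name.end_with?('.pom') }

p original
p copy
p mutated
```

After running this, original should still hold all three names, while copy and mutated hold only "file_3.pom".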
I have a string which looks like:
hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0
Using a regex, I want to get the string after the last '/' up to the end of the line, i.e. for the examples above the output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0
I tried ^(.+)\/(.+)$, which returns an array whose first element is "hello/world" and whose second element is "1.9.2-some-text".
Is there a way to just get "1.9.2-some-text" as the output?
Try using a negated character class ([^…]) like this:
[^\/]+$
This will match one or more of any character other than / followed by the end of the string.
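A minimal sketch applying this pattern to the three sample paths from the question, using String#[] with a regexp (variable names are illustrative):

```ruby
paths = %w[
  hello/world/1.9.2-some-text
  hello/world/2.0.2-some-text
  hello/world/2.11.0
]

# String#[] with a regexp returns the matched substring (or nil).
versions = paths.map { |path| path[/[^\/]+$/] }
p versions
```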
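You can use a negated character class here.
'hello/world/1.9.2-some-text'.match(Regexp.new('[^/]+$'))[0]
# => "1.9.2-some-text"
This means: any character except / (one or more times), followed by the end of the string. Note that match returns a MatchData object, so [0] is needed to get the matched string.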
Although, the simplest way would be to split the string.
'hello/world/1.9.2-some-text'.split('/').last
# => "1.9.2-some-text"
OR
'hello/world/1.9.2-some-text'.split('/')[-1]
# => "1.9.2-some-text"
If you do not need to use a regex, the ordinary way of doing such a thing is:
File.basename("hello/world/1.9.2-some-text")
#=> "1.9.2-some-text"
This is one way:
s = 'hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0'
s.lines.map { |l| l[/.*\/(.*)/,1] }
#=> ["1.9.2-some-text", "2.0.2-some-text", "2.11.0"]
You said, "in above examples output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0". That's neither a string nor an array, so I assumed you wanted an array. If you want a string, tack .join(', ') onto the end.
Regexes are greedy by default, so .*\/ will match all characters up to and including the last / in each line. The second argument, 1, makes String#[] return the contents of capture group 1, i.e. (.*).
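To see what greediness does here, this sketch compares the greedy pattern with a lazy variant (.*?). The lazy version is my own addition, for contrast only.

```ruby
line = 'hello/world/1.9.2-some-text'

# Greedy: .* grabs as much as possible, so \/ lands on the LAST slash.
greedy = line[/.*\/(.*)/, 1]

# Lazy: .*? grabs as little as possible, so \/ lands on the FIRST slash.
lazy = line[/.*?\/(.*)/, 1]

p greedy
p lazy
```

The greedy match yields "1.9.2-some-text", whereas the lazy one yields "world/1.9.2-some-text".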
I got a string in Ruby like this:
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"
Now I'd like to return just an array with only the words (without the quotes) that are between the round brackets (...). The regex below works, but it includes 'enum', which I don't need.
str.scan(/\w+/)
expected result should be:
{"OPTICAL"=>"optical", "DISPLAY"=>"display", "OTHER"=>"other", "FLOPPY"=>"floppy", "STORAGE"=>"storage", "NIC"=>"nic", "HDD"=>"hdd", "CPU"=>"cpu"}
thanks!
I'd suggest using negative lookahead to eliminate words followed by (:
str.scan(/\w+(?!\w|\()/)
Edit: regex updated, now it also excludes \w, so it won't match word prefixes.
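Here is a short sketch of why the extra \w in the lookahead matters. The first pattern (without \w) is included only for comparison.

```ruby
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"

# Without \w in the lookahead, the engine backtracks and matches the prefix "enu".
prefix_leak = str.scan(/\w+(?!\()/)

# With \w in the lookahead, a match must end at a word boundary not followed by "(".
words = str.scan(/\w+(?!\w|\()/)

p prefix_leak.first
p words
```

The first scan starts with "enu", while the second returns only the eight quoted words.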
Based on the output you wanted this will work.
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"
arr = str.scan(/'(\w+)'/)
hs = Hash[arr.map { |e| [e.first.upcase,e.first] }]
p hs #=> {"CPU"=>"cpu", "HDD"=>"hdd", "STORAGE"=>"storage", "NIC"=>"nic", "DISPLAY"=>"display", "OPTICAL"=>"optical", "FLOPPY"=>"floppy", "OTHER"=>"other"}
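As a possible variation, assuming Ruby 2.1 or later (where Array#to_h is available), the same hash can be built without Hash[]. The variable names here are illustrative.

```ruby
str = "enum('cpu','hdd','storage','nic','display','optical','floppy','other')"

# scan with one capture group returns an array of one-element arrays,
# so flatten it before building the [key, value] pairs.
words = str.scan(/'(\w+)'/).flatten
hs = words.map { |word| [word.upcase, word] }.to_h

p hs
```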
Let's say I have the following array:
arr = ["", "2121", "8", "myString"]
I want to return false in case the array contains any non-digit symbols.
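arr.all? { |s| s =~ /\A\d+\z/ }
This checks whether each element consists only of digits (\d). If any of them does not, false is returned. The \A and \z anchors are used because in Ruby ^ and $ match at line boundaries, not at the start and end of the string.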
Edit: You didn't specify whether the empty string is valid. If it is, the line has to be rewritten as follows (as per DarkDust):
arr.all? { |s| s =~ /\A\d*\z/ }
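A small sketch contrasting the two quantifiers. It assumes the string anchors \A and \z and uses String#match? (Ruby 2.4+); the lambda names strict and lenient are made up for illustration.

```ruby
arr        = ['', '2121', '8', 'myString']
with_empty = ['', '2121', '8']
digits     = %w[2121 8]

# \d+ needs at least one digit, so the empty string is rejected.
strict = ->(list) { list.all? { |s| s.match?(/\A\d+\z/) } }

# \d* allows zero digits, so the empty string is accepted.
lenient = ->(list) { list.all? { |s| s.match?(/\A\d*\z/) } }

p strict.call(arr)          # false because of "" and "myString"
p strict.call(digits)
p strict.call(with_empty)
p lenient.call(with_empty)
```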
If empty strings are allowed:
def contains_non_digit(array)
!array.select {|s| s =~ /^.*[^0-9].*$/}.empty?
end
Explanation: this filters the array for all strings that match a regular expression. This regex matches any string that contains at least one non-digit character. If the resulting array is empty, the array contains no non-digit strings. Finally, we need to negate the result, because we want to know whether the array does contain non-digit strings.
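The same idea can probably be written more directly with any? and \D (any non-digit character). This is only a sketch of an alternative, and the method name contains_non_digit? is my own.

```ruby
# Returns true if at least one string contains a non-digit character.
# Empty strings contain no characters at all, so they never trigger it.
def contains_non_digit?(array)
  array.any? { |s| s.match?(/\D/) }
end

p contains_non_digit?(['', '2121', '8', 'myString'])
p contains_non_digit?(['', '2121', '8'])
```

The first call should print true (because of "myString") and the second false.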
If I wanted to remove characters like .!,'"^-# from an array of strings, how would I go about this while retaining all alphabetical and numeric characters?
Allowed alphabetical characters should also include letters with diacritical marks, such as à or ç.
You should use a regex with the correct character property. In this case, you can invert the Alnum class (Alphabetic and numeric character):
"◊¡ Marc-André !◊".gsub(/\p{^Alnum}/, '') # => "MarcAndré"
For more complex cases, say you also wanted to keep punctuation, you can build a set of acceptable characters like:
"◊¡ Marc-André !◊".gsub(/[^\p{Alnum}\p{Punct}]/, '') # => "¡Marc-André!"
For all character properties, you can refer to the doc.
string.gsub(/[^[:alnum:]]/, "")
The following will work for an array:
z = ['asfdå', 'b12398!', 'c98347']
z.each { |s| s.gsub! /[^[:alnum:]]/, '' }
puts z.inspect
I borrowed Jeremy's suggested regex.
You might consider a regular expression.
http://www.regular-expressions.info/ruby.html
I'm assuming that you're using ruby since you tagged that in your post. You could go through the array, put it through a test using a regexp, and if it passes remove/keep it based on the regexp you use.
A regexp you might use might go something like this:
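[^.!,^#-]
That will tell you whether a character is not one of those inside the brackets. The hyphen is placed last so that it is taken literally; written as ^-# it would be read as an (invalid) range. However, I suggest that you look up regular expressions; you might find a better solution once you know their syntax and usage.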
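As a rough sketch of this approach: inside a character class a hyphen has to come last (or be escaped) to be taken literally. The exact character list and the sample strings below are assumptions for illustration.

```ruby
# Characters to strip; the hyphen comes last so it is not read as a range.
unwanted = /[.!,'"^#-]/

samples = ['b12398!', 'c-98#347', 'plain']

# gsub returns a new string with every unwanted character removed.
cleaned = samples.map { |s| s.gsub(unwanted, '') }
p cleaned
```

Note that a list like this removes only the characters it names, so anything not listed (including letters with diacritics) is left alone.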
If you truly have an array (as you state) and it is an array of strings (I'm guessing), e.g.
foo = [ "hello", "42 cats!", "yöwza" ]
then I can imagine that you either want to update each string in the array with a new value, or that you want a modified array that only contains certain strings.
If the former (you want to 'clean' every string in the array) you could do one of the following:
foo.each{ |s| s.gsub! /\p{^Alnum}/, '' } # Change every string in place…
bar = foo.map{ |s| s.gsub /\p{^Alnum}/, '' } # …or make an array of new strings
#=> [ "hello", "42cats", "yöwza" ]
If the latter (you want to select a subset of the strings where each matches your criteria of holding only alphanumerics) you could use one of these:
# Select only those strings that contain ONLY alphanumerics
bar = foo.select{ |s| s =~ /\A\p{Alnum}+\z/ }
#=> [ "hello", "yöwza" ]
# Shorthand method for the same thing
bar = foo.grep /\A\p{Alnum}+\z/
#=> [ "hello", "yöwza" ]
In Ruby, regular expressions of the form /\A…\z/ require the entire string to match, as \A anchors the regular expression to the start of the string and \z anchors to the end.
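To show why the string anchors matter, here is a small sketch comparing them with the line anchors ^ and $ on a multi-line string. The sample string is made up, and String#match? needs Ruby 2.4 or later.

```ruby
# A two-line string: the first line is purely alphanumeric, the second is not.
multiline = "hello\n42 cats!"

# ^ and $ match at line boundaries, so the first line alone satisfies the pattern.
line_anchored = multiline.match?(/^\p{Alnum}+$/)

# \A and \z match only at the very start and end of the whole string.
string_anchored = multiline.match?(/\A\p{Alnum}+\z/)

p line_anchored
p string_anchored
```

The line-anchored check should print true and the string-anchored one false.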