Regex for three expressions with 'AND' - ruby

I need to return true if a string matches three regexes. I have a lot of regex options around each regex pattern. I can use separate match/scan for each of the three values, and conjoin them with AND to see if they all return TRUE. The pipe does not work.
In the code below, I need to get TRUE only for the first string, mystr3:
mystr3= ' OK 3 values MyServer and myNode and myuser TRUE '
mystr2= 'has on 2 values mynode## and .myserver should be FALSE'
mystr1= ' has on 1 values Myserver should be FALSE'
regex1 = /\bmyserver\b/i ; regex2 = /\bmynode\b/i ; regex3 = /\bmyuser\b/i
regex = /#{regex1}|#{regex2}|#{regex3}/ ## AND /#{regex2}/ and /#{regex3}/
p 'match3 ' + mystr3.scan(regex).to_s
p 'match2 ' + mystr2.scan(regex).to_s
But I think there should be something easier than that.

To check that the string matches all three, you can use one lookahead per subexpression:
regex = /^(?=.*#{regex1})(?=.*#{regex2})(?=.*#{regex3})/
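As a minimal sketch using the strings from the question, the combined lookahead pattern can be checked as follows. Interpolating a Regexp into another one keeps its own flags, so the /i on each sub-pattern still applies. The helper all_match? is a hypothetical name, shown only as an equivalent check without lookaheads.

```ruby
regex1 = /\bmyserver\b/i
regex2 = /\bmynode\b/i
regex3 = /\bmyuser\b/i

# Each lookahead is tried from the same line start, so all three
# patterns have to occur somewhere on the same line.
all_three = /^(?=.*#{regex1})(?=.*#{regex2})(?=.*#{regex3})/

# Hypothetical helper: the same "AND" check without lookaheads.
def all_match?(str, *patterns)
  patterns.all? { |pattern| str.match?(pattern) }
end

mystr3 = ' OK 3 values MyServer and myNode and myuser TRUE '
mystr2 = 'has on 2 values mynode## and .myserver should be FALSE'
mystr1 = ' has on 1 values Myserver should be FALSE'

[mystr3, mystr2, mystr1].each do |s|
  puts "#{s.match?(all_three)} #{all_match?(s, regex1, regex2, regex3)}"
end
```

Both checks agree on the three sample strings: only mystr3 contains all three words.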

Related

How to put comma after 3 digits in numeric variable in vbscript?

I want to put a comma after 3 digits in a numeric variable in VBScript.
w_orimpo = getvalue(rsmodifica , "w_orimpo")
w_orimpo = FormatNumber(w_orimpo,2)
The initial value of w_orimpo is 21960.
If I use FormatNumber I get the value 21,960.
But I would like to get the following one -> 219,60
We can handle this via a regex replacement:
Dim input, output, regex1, regex2
input = "21960" ' plain string assignment; Set is only for object references
Set regex1 = New RegExp
Set regex2 = New RegExp
regex1.Pattern = "(\d{3})"
regex1.Global = True
regex2.Pattern = ",$"
' Group from the left so that 21960 becomes 219,60
output = regex1.Replace(input, "$1,")
output = regex2.Replace(output, "")
Rhino.Print output
Note that two regex replacements are needed here because VBScript's regex engine does not support lookarounds. There is a single regex pattern which would have gotten the job done here:
(\d{3})(?!$)
This would match (and capture) only groups of three digits at a time, and only if those three digits are not followed by the end of the input. This is needed to cover the following edge case:
123456 -> 123,456
We don't want a comma after the final group of three digits. My answer gets around this problem by doing another regex replacement to trim off any trailing comma.
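For comparison, here is a minimal sketch of the single-pattern version in Ruby, whose regex engine does support the negative lookahead. The helper name comma_every_three is hypothetical, and the input is assumed to be a string of digits.

```ruby
# Hypothetical helper: inserts a comma after each full group of three
# digits counted from the left, except when that group ends the string.
def comma_every_three(digits)
  digits.gsub(/(\d{3})(?!$)/, '\1,')
end

puts comma_every_three("21960")   # prints 219,60
puts comma_every_three("123456")  # prints 123,456
```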
Or without regex:
Mid(CStr(w_orimpo), 1, 3) & "," & Mid(CStr(w_orimpo), 4)
Or
Dim divider
divider = 10 ^ (Len(CStr(w_orimpo)) - 3)
w_orimpo = FormatNumber(w_orimpo / divider, 2)

Regex to match a specific sequence of strings

Assuming I have 2 array of strings
position1 = ['word1', 'word2', 'word3']
position2 = ['word4', 'word1']
and I want to check whether the substring #{target}, which exists in the text, has one of the words of position1 on one side, one of the words of position2 on the other side, or both at the same time. In other words, I am looking to the left and right of #{target}.
For example in the sentence "Writing reports and inputting data onto internal systems, with regards to enforcement and immigration papers" if the target word is data I would like to check if the word left (inputting) and right (onto) are included in the arrays or if one of the words in the arrays return true for the regex match. Any suggestions? I am using Ruby and I have tried some regex but I can't make it work yet. I also have to ignore any potential special characters in between.
One of them:
/^.*\b(#{joined_position1})\b.*$[\s,.:-_]*\b#{target}\b[\s,.:-_\\\/]*^.*\b(#{joined_position2})\b.*$/i
Edit:
I figured out this way with regex to capture the word left and right:
(\S+)\s*#{target}\s*(\S+)
However, what could I change if I would like to capture more than one word to the left and right?
If you have two arrays of strings, what you can do is something like this:
matches = /^.+ (\S+) #{target} (\S+) .+$/.match(text)
if matches and (position1.include?(matches[1]) or position2.include?(matches[2]))
  do_something()
end
What this regex does is match the target word in your text and extract the words next to it using capture groups. The code then compares those words against your arrays, and does something if they're in the right places. A more general version of this might look like:
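def checkWords(target, text, leftArray, rightArray, numLeft = 1, numRight = 1)
  # Build the regex. Single quotes keep \S intact; inside a
  # double-quoted string "\S" would collapse to a plain "S".
  regex = '^.+'
  regex += ' (\S+)' * numLeft
  regex += " #{Regexp.escape(target)}"
  regex += ' (\S+)' * numRight
  regex += ' .+$'
  pattern = Regexp.new(regex)
  matches = pattern.match(text)
  return false if !matches
  # Captures 1..numLeft are the words to the left of the target
  for i in 1..numLeft
    return false if !leftArray.include?(matches[i])
  end
  # The remaining captures are the words to the right of the target
  for i in 1..numRight
    return false if !rightArray.include?(matches[numLeft + i])
  end
  return true
end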
Which can then be invoked like this:
do_something() if checkWords("data", text, position1, position2, 2, 2)
I'm pretty sure it's not terribly idiomatic, but it gives you a sense of how you would do what you want in a more general way.
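Since the question also asks to ignore special characters between the words, one alternative is to tokenize the text first and compare neighbours by index instead of building one large regex. This is only a sketch of a different technique, not the approach above: the helper name neighbors_match? is hypothetical, words are assumed to be runs of \w characters, and a match on either side counts (like the `or` in the first snippet).

```ruby
# Hypothetical helper: true if any occurrence of target has a left neighbour
# in left_words or a right neighbour in right_words.
def neighbors_match?(text, target, left_words, right_words)
  # Scanning for \w+ drops punctuation and whitespace between words.
  words = text.downcase.scan(/\w+/)
  words.each_index.any? do |i|
    next false unless words[i] == target.downcase
    left  = i > 0 ? words[i - 1] : nil
    right = words[i + 1]
    left_words.include?(left) || right_words.include?(right)
  end
end

text = "Writing reports and inputting data onto internal systems, " \
       "with regards to enforcement and immigration papers"

puts neighbors_match?(text, "data", ["inputting"], ["onto"])
puts neighbors_match?(text, "data", ["word1", "word2"], ["word4"])
```

Capturing more than one word on each side would then be a matter of slicing the words array around the index instead of adding more capture groups.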

best way to find substring in ruby using regular expression

I have a string https://stackoverflow.com. I want a new string that contains the domain from the given string using regular expressions.
Example:
x = "https://stackoverflow.com"
newstring = "stackoverflow.com"
Example 2:
x = "https://www.stackoverflow.com"
newstring = "www.stackoverflow.com"
"https://stackoverflow.com"[/(?<=:\/\/).*/]
#⇒ "stackoverflow.com"
(?<=..) is a positive lookbehind.
If string = "http://stackoverflow.com",
a really easy way is string.split("http://")[1]. But this isn't regex.
A regex solution would be as follows:
string.scan(/^http:\/\/(.+)$/).flatten.first
To explain:
String#scan returns an array of every match of the regex. Because the regex contains a capture group, each match is itself an array of the captured strings.
The regex:
^ matches beginning of line
http: matches those characters
\/\/ matches //
(.+) sets a "match group" containing one or more of any character. This is the value returned by the scan.
$ matches end of line
.flatten.first extracts the result from String#scan, which in this case returns a nested array.
You might want to try this:
#!/usr/bin/env ruby
str = "https://stackoverflow.com"
if mtch = str.match(/(?::\/\/)(\S+)/)
  f1 = mtch.captures
end
There are two groups in the pattern: the first is a non-capturing group that matches the :// separator, and the second captures everything after it. The captures method then returns an array holding the captured text, so f1 is ["stackoverflow.com"].
I hope this solves your problem.
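If a regex is not strictly required, Ruby's standard uri library can also extract the host. The sketch below assumes the input is a well-formed absolute URL; a regex variant that accepts both http and https is shown alongside for comparison.

```ruby
require "uri"

url = "https://www.stackoverflow.com"

# Regex variant: capture everything after the scheme up to the next slash.
host_by_regex = url[%r{\Ahttps?://([^/]+)}, 1]

# Library variant: let URI do the parsing.
host_by_uri = URI.parse(url).host

puts host_by_regex  # prints www.stackoverflow.com
puts host_by_uri    # prints www.stackoverflow.com
```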

Incrementing numeric parameter in a URL parameter string?

I've had a look round and can't find what I need on Stack Overflow, and was wondering if someone had a simple solution.
I want to find a parameter within a URL and increment its value, so, as an example:
?kws=&pstc=&cty=&prvnm=1
I want to be able to locate the prvnm parameter no matter where it is in the string and increment its value by 1.
I know I could split the parameters into an array, find the key, increment it and write it back but that seems rather long winded and wondered if someone else had any ideas!
require "uri"
url = "http://example.com/?kws=&pstc=&cty=&prvnm=1"
def new_url(url)
  uri = URI.parse(url)
  hsh = Hash[URI.decode_www_form(uri.query)]
  hsh['prvnm'] = hsh['prvnm'].next
  uri.query = URI.encode_www_form(hsh).to_s
  uri.to_s
end
new_url(url) # => "http://example.com/?kws=&pstc=&cty=&prvnm=2"
There are already four answers, so I had to come up with something a little different:
s = "?kws=&pstc=&cty=&prvnm=1"
head, sep, tail = s.partition(/(?<=[?&]prvnm=)\d+/)
head + (sep.to_i + 1).to_s + tail # => "?kws=&pstc=&cty=&prvnm=2"
'String#partition' returns an array of three strings [head, sep, tail], such that head + sep + tail == s, where sep is the first part of s that matches partition's argument, which can be a string or a regex.
We want the separator to be the digits following prvnm=. We therefore use a regex with \d+ preceded by that string, which we want to treat as having zero length so that it is not included in the separator. That calls for a "positive look-behind": (?<=[?&]prvnm=). \d+ is "greedy", so it takes all consecutive digits.
For the given value of s, s.partition(/(?<=[?&]prvnm=)\d+/)
=> ["?kws=&pstc=&cty=&prvnm=", "1", ""].
Edit: my thanks to @quetzalcoatl for pointing out that I needed to change (?<=&prvnm=) in my regex to what I have now, as what I had would fail when ?prvnm= was at the beginning of the string.
split the string by `&`
then iterate over the parts
then split each part by `=` and inspect the results
when found `prvnm`, parse the integer and increment it
then join the bits by '='
then join the parts by '&'
Or, use regex like:
/[?&]prvnm=\d+/
and parse the result and then do a replacement.
Or use a URL-parsing library.
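The regex route can be sketched in a single substitution. This is a minimal example, assuming the parameter value is a non-negative integer; the helper name increment_param is hypothetical.

```ruby
# Hypothetical helper: finds "?name=<digits>" or "&name=<digits>" and
# rewrites the digits with their value plus one.
def increment_param(query, name)
  query.sub(/([?&]#{Regexp.escape(name)}=)(\d+)/) do
    # $1 is the "?name=" or "&name=" prefix, $2 the current number.
    "#{$1}#{$2.to_i + 1}"
  end
end

puts increment_param("?kws=&pstc=&cty=&prvnm=1", "prvnm")  # prints ?kws=&pstc=&cty=&prvnm=2
puts increment_param("?prvnm=9&kws=", "prvnm")             # prints ?prvnm=10&kws=
```

If the parameter is missing, sub finds nothing to replace and the string comes back unchanged.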
Try something like this:
params = "?kws=&pstc=&cty=&prvnm=1"
num = params.scan(/prvnm=(\d+)/)[0].join.to_i
puts num + 1
Use:
require 'uri'
require 'cgi'
Then:
parsed_url = URI.parse(url) # url is your full URL string
r = CGI.parse(parsed_url.query)
r is now a hash that maps each query parameter to an array of its values.
You can easily access it by using:
r["prvnm"].first.to_i + 1

Checking if a string has balanced parentheses

I am currently working on a Ruby problem quiz but I'm not sure if my solution is right. After running the check, it shows that the compilation was successful, but I'm just worried it is not the right answer.
The problem:
A string S consisting only of characters '(' and ')' is called properly nested if:
S is empty,
S has the form "(U)" where U is a properly nested string,
S has the form "VW" where V and W are properly nested strings.
For example, "(()(())())" is properly nested and "())" isn't.
Write a function
def nesting(s)
that given a string S returns 1 if S is properly nested and 0 otherwise.
Assume that the length of S does not exceed 1,000,000. Assume that S consists only of characters '(' and ')'.
For example, given S = "(()(())())" the function should return 1 and given S = "())" the function should return 0, as explained above.
Solution:
def nesting ( s )
  # write your code here
  if s == '(()(())())' && s.length <= 1000000
    return 1
  elsif s == ' ' && s.length <= 1000000
    return 1
  elsif s == '())'
    return 0
  end
end
Here are descriptions of two algorithms that should accomplish the goal. I'll leave it as an exercise to the reader to turn them into code (unless you explicitly ask for a code solution):
Start with a variable set to 0 and loop through each character in the string: when you see a '(', add one to the variable; when you see a ')', subtract one from the variable. If the variable ever goes negative, you have seen too many ')' and can return 0 immediately. If you finish looping through the characters and the variable is not exactly 0, then you had too many '(' and should return 0.
Remove every occurrence of '()' in the string (replace with ''). Keep doing this until you find that nothing has been replaced (check the return value of gsub!). If the string is empty, the parentheses were matched. If the string is not empty, it was mismatched.
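For reference, the two descriptions above could be sketched in Ruby roughly as follows. The method names are hypothetical, and the input is assumed to contain only '(' and ')' as the problem states.

```ruby
# First algorithm: track the nesting depth with a counter.
def nesting_by_counter(s)
  depth = 0
  s.each_char do |c|
    depth += (c == "(" ? 1 : -1)
    return 0 if depth < 0   # a ')' arrived with nothing open
  end
  depth.zero? ? 1 : 0       # leftover '(' means not properly nested
end

# Second algorithm: keep deleting "()" until nothing changes.
def nesting_by_removal(s)
  rest = s.dup                       # leave the caller's string untouched
  nil while rest.gsub!("()", "")     # gsub! returns nil once nothing was replaced
  rest.empty? ? 1 : 0
end

puts nesting_by_counter("(()(())())")  # prints 1
puts nesting_by_removal("())")         # prints 0
```

The counter version makes a single pass. The removal version needs one pass per nesting level, so on a deeply nested string near the 1,000,000 character limit it is likely to be much slower.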
You're not supposed to just enumerate the given examples. You're supposed to solve the problem generally. You're also not supposed to check that the length is below 1000000, you're allowed to assume that.
The most straightforward solution to this problem is to iterate through the string and keep track of how many parentheses are open right now. If you ever see a closing parenthesis when no parentheses are currently open, the string is not well-balanced. If any parentheses are still open when you reach the end, the string is not well-balanced. Otherwise it is.
Alternatively you could also turn the specification directly into a regex pattern using the recursive regex feature of ruby 1.9 if you were so inclined.
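A sketch of what such a recursive pattern might look like, using the \g<name> subexpression call available since Ruby 1.9. The method name is hypothetical, and very deeply nested input may run into regex engine limits, so treat this as an illustration rather than a solution for the full 1,000,000 character bound.

```ruby
def nesting_by_regex(s)
  # The group "nested" is one "(" followed by any number of nested groups
  # and a closing ")". The outer * allows concatenation and the empty string.
  pattern = /\A(?<nested>\(\g<nested>*\))*\z/
  s.match?(pattern) ? 1 : 0
end

puts nesting_by_regex("(()(())())")  # prints 1
puts nesting_by_regex("())")         # prints 0
```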
My algorithm would use stacks for this purpose. Stacks are meant for solving such problems.
Algorithm
Define a hash which holds the list of balanced brackets for
instance {"(" => ")", "{" => "}", and so on...}
Declare a stack (in our case, array) i.e. brackets = []
Loop through the string using each_char and compare each character with keys of the hash and push it to the brackets
Within the same loop compare it with the values of the hash; for a closing bracket, pop from brackets and check that the popped opening bracket is its partner
In the end, if the brackets stack is empty, the brackets are balanced.
def brackets_balanced?(string)
  brackets_hash = {"(" => ")", "{" => "}", "[" => "]"}
  brackets = []
  string.each_char do |x|
    # Opening bracket: remember it
    brackets.push(x) if brackets_hash.key?(x)
    # Closing bracket: the most recent opener must be its partner
    if brackets_hash.value?(x)
      return false unless brackets_hash[brackets.pop] == x
    end
  end
  return brackets.empty?
end
You can solve this problem theoretically, by using a grammar like this:
S ← L S R S | ε
L ← (
R ← )
The grammar is easily handled by a recursive algorithm.
That would be the most elegant solution. Otherwise, as already mentioned here, count the open parentheses.
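A recursive-descent check could look roughly like this. The sketch assumes the grammar S → ( S ) S | ε, which also covers the empty string and concatenated groups; the method names are hypothetical, and because it recurses once per nesting level, very deep input could exhaust the stack.

```ruby
# Parses one S starting at index i and returns the index just past it,
# or nil if a "(" is never closed.
def parse_s(s, i)
  # The trailing S of the rule is handled by looping instead of recursing.
  while s[i] == "("
    i = parse_s(s, i + 1)                  # inner S
    return nil if i.nil? || s[i] != ")"    # the matching ")" must follow
    i += 1
  end
  i
end

# The string is properly nested when S consumes all of it.
def nested_by_grammar?(s)
  parse_s(s, 0) == s.length
end

puts nested_by_grammar?("(()(())())")  # prints true
puts nested_by_grammar?("())")         # prints false
```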
Here's a neat way to do it using inject:
class String
  def valid_parentheses?
    valid = true
    self.gsub(/[^\(\)]/, '').split('').inject(0) do |counter, parenthesis|
      counter += (parenthesis == '(' ? 1 : -1)
      valid = false if counter < 0
      counter
    end.zero? && valid
  end
end
> "(a+b)".valid_parentheses? # => true
> "(a+b)(".valid_parentheses? # => false
> "(a+b))".valid_parentheses? # => false
> "(a+b))(".valid_parentheses? # => false
You're right to be worried; I think you've got the wrong end of the stick, and you're solving the problem too literally. The information that the string doesn't exceed 1,000,000 characters is just there to stop people worrying about how slow their code would run if the length were 100 times that, and the examples are just that, examples, not the definitive list of strings you can expect to receive.
I'm not going to do your homework for you (by writing the code), but will give you a pointer to a solution that occurs to me:
The string is correctly nested if every left bracket has a right bracket to the right of it, with either nothing or a correctly nested set of brackets between them. So how about a recursive function, or a loop, that removes every match of "()" from the string? When you run out of matches, what are you left with? Nothing? That was a properly nested string then. Something else (like ')' or ')(', etc.) would mean it was not correctly nested in the first place.
Define method:
def check_nesting str
  pattern = /\(\)/
  while str =~ pattern do
    str = str.gsub pattern, ''
  end
  str.length == 0
end
And test it:
>ruby nest.rb (()(())())
true
>ruby nest.rb (()
false
>ruby nest.rb ((((()))))
true
>ruby nest.rb (()
false
>ruby nest.rb (()(((())))())
true
>ruby nest.rb (()(((())))()
false
Your solution only returns the correct answer for the strings "(()(())())" and "())". You surely need a solution that works for any string!
As a start, how about counting the number of occurrences of ( and ), and seeing if they are equal?
