How do I "regex-quote" a string in XPath - xpath

I have an XSL template that takes 2 parameters (text and separator) and calls tokenize($text, $separator).
The problem is, the separator is supposed to be a regex. I have no idea what I get passed as separator string.
In Java I would call Pattern.quote(separator) to get a pattern that matches the exact string, no matter what weird characters it might contain, but I could not find anything like that in XPath.
I could iterate through the string and escape any character that I recognize as a special character with regard to regex.
I could build an iteration with substring-before and do the tokenize that way.
I was wondering if there is an easier, more straightforward way?

You could escape your separator tokens using the XPath replace function to find any character that requires escaping, and precede each occurrence with a \. Then you could pass such an escaped token to the XPath tokenize function.
Alternatively you could just implement your own tokenize stylesheet function, and use the substring-before and substring-after functions to extract substrings, and recursion to process the full string.
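The escape-then-tokenize idea can be sketched outside XSLT. The following is a minimal Ruby illustration, not XPath: the names `SPECIAL`, `escape_separator` and `tokenize` are hypothetical, and the set of characters treated as special is an assumption that may need adjusting for XPath's regex dialect.

```ruby
# Characters treated as regex metacharacters in this sketch (an assumed set).
SPECIAL = /[.\\?*+{}()\[\]|^$-]/

# Precede every special character with a backslash.
def escape_separator(separator)
  separator.gsub(SPECIAL) { |ch| "\\#{ch}" }
end

# Split text on the literal separator by escaping it first.
def tokenize(text, separator)
  text.split(Regexp.new(escape_separator(separator)))
end

dots  = tokenize('a.b.c', '.')
pipes = tokenize('1|2||3', '||')
```

In XSLT the same escaping step would be done with replace before the result is handed to tokenize.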

Related

Regular expression is not working if the JSON response value contains a special character. How to extract the 'sid' value from the JSON below

The regular expression "sid"?\s*:?\s*"(\w+)" is not working because the sid value in the JSON response contains a special character. How do I extract the 'sid' value from the JSON below?
0{"sid":"RhANkc9V7-psbnzmJAAGS","upgrades":["websocket"],"pingTimeout":20000,"pingInterval":25000}
Your \w+ metacharacter matches only "word" characters (letters, digits and underscores; it can also be written as the character class [a-zA-Z0-9_]).
The dash (-) in your sid value is not a word character, so the match fails. One option is to match any character lazily instead of only "word" characters, something like:
sid"?\s*:?\s*"(.*?)"
Another possible option is removing this starting 0 using JSR223 Post-Processor and the following Groovy code:
prev.setResponseData(prev.getResponseDataAsString().substring(1),'UTF-8')
Once done, you will be able to use the JSON Extractor or the JSON JMESPath Extractor.
Your capturing group currently matches only one or more word characters. To capture special characters along with word characters, change your regular expression as below:
sid":"(.+?)"
In the above regular expression, . matches any character (except line terminators), whereas \w matches only word characters, i.e. a-z, A-Z, 0-9, _
You can also use Boundary Extractor if you don't want to mess with the complexity of Regular Expression.
Left Boundary: sid":"
Right Boundary: ",
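The difference between the patterns can be checked outside JMeter. A minimal Ruby sketch, using the response from the question; the variable names are illustrative only, and a greedy variant is included to show why the lazy quantifier matters.

```ruby
response = '0{"sid":"RhANkc9V7-psbnzmJAAGS","upgrades":["websocket"],"pingTimeout":20000,"pingInterval":25000}'

# \w+ stops at the dash, so the closing quote is never reached and nothing matches.
word_only = response[/"sid"?\s*:?\s*"(\w+)"/, 1]

# Lazy .+? stops at the first closing quote after the value.
lazy = response[/sid":"(.+?)"/, 1]

# Greedy .* runs on to the last quote in the line and captures far too much.
greedy = response[/sid"?\s*:?\s*"(.*)"/, 1]

puts word_only.inspect
puts lazy
puts greedy
```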

How can I check for repeated strings with check-tail plugin in Sensu?

I am using sensu and the check-tail.rb plugin to alert if any errors appear in my app logs. The problem is that I want the check to be successful if it finds 3 or more error messages.
The solution that I came up with is using a regex like:
\^.*"status":503,.*$.*^.*"status":503,.*$.*^.*"status":503,.*$\im
But it seems to not work because of the match function: instead of passing the variable as a ruby regex it passes it as a string (this can be seen here).
You need to pass the pattern as a string literal, not as a Regexp object.
Thus, you need to remove the regex delimiters and change the modifiers to their inline option variants, that is, prepend the pattern with (?im).
(?im)\A.*"status":503,.*$.*^.*"status":503,.*$.*^.*"status":503,.*\z
Note that to match the start of string in Ruby, you need to use \A and to match the end of string, you need to use \z anchors.
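A minimal sketch of why the inline modifiers work: String#match converts a string argument to a Regexp, so (?im) inside the string still takes effect. The log excerpts and the helper name `three_or_more?` are invented for illustration and are not part of the plugin.

```ruby
# Hypothetical log excerpts, invented for this sketch.
three_errors = <<~LOG
  {"status":503,"path":"/a"}
  {"status":200,"path":"/b"}
  {"status":503,"path":"/c"}
  {"status":503,"path":"/d"}
LOG

two_errors = <<~LOG
  {"status":503,"path":"/a"}
  {"status":200,"path":"/b"}
  {"status":503,"path":"/c"}
LOG

# The pattern is a plain string: no delimiters, modifiers given inline.
PATTERN = '(?im)\A.*"status":503,.*$.*^.*"status":503,.*$.*^.*"status":503,.*\z'

# String#match turns the string pattern into a Regexp before matching.
def three_or_more?(text, pattern)
  !text.match(pattern).nil?
end

puts three_or_more?(three_errors, PATTERN)
puts three_or_more?(two_errors, PATTERN)
```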

Ruby gsub! with Regex keep word

I want to turn a string like "variable.to_s" into "str(variable)" using gsub! with regex. I currently have
string = "variable.to_s"
string.gsub!(/\w+\.to_s/,/str(\w)/)
This obviously does not work, since you cannot use a regex as the second argument of gsub. How do I keep the \w+ match from the first argument while replacing the .to_s part?
You're capturing the wrong thing:
string.gsub!(/(\w+)\.to_s/, 'str(\1)')
gsub and gsub! take a string or regular expression as the first argument and a string or block as the second argument. You're sending a regular expression to both.
If you need to use a portion of the match in the second part, capture it with parentheses. You did this inadvertently in your code but on the wrong side.
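A short sketch of both replacement forms, using the non-mutating gsub so the original string stays available; the variable names are illustrative.

```ruby
string = 'variable.to_s'

# String replacement: \1 refers to the first capture group.
with_backreference = string.gsub(/(\w+)\.to_s/, 'str(\1)')

# Block replacement: $1 holds the first capture group for each match.
with_block = string.gsub(/(\w+)\.to_s/) { "str(#{$1})" }

puts with_backreference
puts with_block
```

The replacement string is single-quoted so that \1 reaches gsub unchanged; in a double-quoted string it would have to be written as \\1.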

Matching braces in ruby with a character in front

I have read quite a few posts here for matching nested braces in Ruby using Regexp. However I cannot adapt it to my situation and I am stuck. The Ruby 1.9 book uses the following to match a set of nested braces
/\A(?<brace_expression>{([^{}]|\g<brace_expression>)*})\Z/x
I am trying to alter this in three ways. 1. I want to use parentheses instead of braces, 2. I want a character in front (such as a hash symbol), and 3. I want to match anywhere in the string, not just beginning and end. Here is what I have so far.
/(#(?<brace_expression>\(([^\(\)]|\g<brace_expression>)*\)))/x
Any help in getting the right expression would be appreciated.
Using the regex modifier x enables comments in the regex. So the # in your regex is interpreted as a comment character and the rest of the regex is ignored. You'll need to either escape the # or remove the x modifier.
Btw: There's no need to escape the parentheses inside [].
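Both fixes can be sketched as follows. This is one possible shape under assumptions, not necessarily the exact expression the asker wants: the group name `parens`, the non-capturing inner group and the sample text are choices made for this example.

```ruby
# With the x modifier, the # must be escaped so it is not read as a comment.
with_x = /\#(?<parens>\((?:[^()]|\g<parens>)*\))/x

# Without the x modifier, a plain # is matched literally.
without_x = /#(?<parens>\((?:[^()]|\g<parens>)*\))/

text = 'x = #(a (b) (c (d))) + (e)'

# String#[] with a Regexp returns the whole matched substring (or nil).
puts text[with_x]
puts text[without_x]
```

Neither pattern is anchored, so the expression is found anywhere in the string, and the parentheses inside [] are left unescaped.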

Another way instead of escaping regex patterns?

Usually when my regex patterns look like this:
http://www.microsoft.com/
Then I have to escape them like this:
string.match(/http:\/\/www\.microsoft\.com\//)
Is there another way instead of escaping it like that?
I want to be able to use the pattern as written, http://www.microsoft.com/, because I don't want to escape all the special characters in all my patterns.
Regexp.new(Regexp.quote('http://www.microsoft.com/'))
Regexp.quote simply escapes any characters that have special regexp meaning; it takes and returns a string. Note that . is also special. After quoting, you can append to the regexp as needed before passing to the constructor. A simple example:
Regexp.new(Regexp.quote('http://www.microsoft.com/') + '(.*)')
This adds a capturing group for the rest of the path.
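A minimal sketch of what the quoting step produces and how the appended group behaves; the sample URLs are made up for illustration.

```ruby
# Regexp.quote returns a string with regex metacharacters escaped.
quoted = Regexp.quote('http://www.microsoft.com/')

# Append a capture group for the rest of the path, then build the Regexp.
pattern = Regexp.new(quoted + '(.*)')

match = pattern.match('http://www.microsoft.com/en-us/windows')
rest  = match[1]

# The escaped dots no longer act as wildcards.
lookalike = pattern.match('http://wwwXmicrosoft.com/en-us/windows')

puts quoted
puts rest
puts lookalike.inspect
```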
You can also use arbitrary delimiters in Ruby for regular expressions by using %r and defining a character before the regular expression, for example:
%r!http://www.microsoft.com/!
Regexp.quote or Regexp.escape can be used to automatically escape things for you:
https://ruby-doc.org/core/Regexp.html#method-c-escape
The result can be passed to Regexp.new to create a Regexp object, and then you can call the object's .match method and pass it the string to match against (the opposite order from string.match(/regex/)).
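You can also pass a single-quoted string, which saves escaping the slashes. Note that match converts the string to a Regexp without escaping it, so characters such as . keep their special meaning.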
string.match('http://www.microsoft.com/')
You can also use %q{} if you need single quotes in the text itself. If you need variables interpolated inside the string, use %Q{}, which is equivalent to double quotes.
If the string contains regex syntax (e.g. .*?()[]^$) that you want interpreted as such, use // or %r{}
For convenience I just define
def regexcape(s)
  Regexp.new(Regexp.escape(s))
end
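A brief usage sketch for such a helper, contrasting it with an unescaped pattern; the sample strings are invented for illustration.

```ruby
def regexcape(s)
  Regexp.new(Regexp.escape(s))
end

literal = regexcape('a.b*c')     # matches the exact text a.b*c
raw     = Regexp.new('a.b*c')    # . and * keep their regex meaning

puts literal.match?('a.b*c')
puts literal.match?('axbbc')
puts raw.match?('axbbc')
```

The escaped version only accepts the literal text, while the raw pattern also accepts strings such as axbbc.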

Resources