Nested POSIX regular expression character class in Ruby?

How do I nest a POSIX-style character class inside another character class?
I'm trying to replace the matching of space or dash:
/[\s-]/
with
/[[[:space:]]-]/
And that isn't working. I'm using Ruby 1.9.3 and the official doc has no examples of nesting. I need the POSIX style because I'm working with UTF-8 and my examples are dumbed down from the actual expressions.
Thanks for any help!

The third set of [] is not needed.
The [:space:] declaration is only valid inside of a set, so it appears as [[:space:]] when it is used by itself. In this case the set contains more characters, so the following will work:
[[:space:]-]
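A quick check of that pattern could look like the sketch below. The constant name SPACE_OR_DASH and the sample strings are made up for illustration, the strings are assumed to be UTF-8, and String#match? needs Ruby 2.4 or later (use =~ on older versions).

```ruby
# Hypothetical constant name; the pattern itself is the one suggested above.
SPACE_OR_DASH = /[[:space:]-]/

# Plain space, dash, non-breaking space (U+00A0), and a string with neither.
samples = ["a b", "a-b", "a\u00A0b", "ab"]

# Record whether each sample contains a space-like character or a dash.
matches = samples.map { |s| s.match?(SPACE_OR_DASH) }

p matches
```

The trailing - is taken literally because it sits at the end of the set, so no escaping is needed there.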

Related

Using a ruby regular expression

I'm completely new to Ruby so I was just wondering if someone could help me out.
I have the following String:
"<planKey><key>OR-J8U</key></planKey>"
What is the regex I have to write to get the center part OR-J8U?

Use the following:
str = "<planKey><key>OR-J8U</key></planKey>"
str[/(?<=\<key\>).*(?=\<\/key\>)/]
#=> "OR-J8U"
This captures everything between the opening and closing key tags, using a lookbehind and a lookahead.
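For comparison, here is a small sketch of the same extraction done two ways: with the lookarounds from this answer, and with a plain capture group. The variable names are only illustrative, and the < and > are left unescaped on the assumption that they have no special meaning in Ruby's regex syntax.

```ruby
str = "<planKey><key>OR-J8U</key></planKey>"

# Lookbehind and lookahead: only the text between the tags is part of the match.
with_lookarounds = str[/(?<=<key>).*(?=<\/key>)/]

# Alternative: match the tags as well and ask String#[] for capture group 1.
with_capture = str[/<key>(.*?)<\/key>/, 1]

puts with_lookarounds
puts with_capture
```

Both variants should give the same result for this input; the capture-group form may be easier to read if lookarounds are unfamiliar.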

If you want to get the string OR-J8U then you could simply use that string in the regular expression. Outside a character class the - character has no special meaning, so escaping it is optional but harmless:
/OR\-J8U/
Though, I believe you want any string that is enclosed within <planKey><key> and </key></planKey>. In that case ice's answer is useful if you allow for an empty string:
/(?<=\<key\>).*(?=\<\/key\>)/
If you don't allow for an empty string, replace the * with +:
/(?<=\<key\>).+(?=\<\/key\>)/
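To see the difference between the two quantifiers, here is a minimal sketch run against a key element with no content. The input string and variable names are made up for the example.

```ruby
empty = "<planKey><key></key></planKey>"

# With *, zero characters between the tags still counts as a match.
allow_empty = empty[/(?<=<key>).*(?=<\/key>)/]

# With +, at least one character is required, so nothing matches here.
require_chars = empty[/(?<=<key>).+(?=<\/key>)/]

p allow_empty
p require_chars
```

The first call returns an empty string and the second returns nil, which is usually easier to test for in calling code.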
If you prefer a more general approach (any string enclosed within any tags), then I believe the common opinion is not to use a regular expression. Instead consider using an HTML parser. On SO you can find some questions and answers in that regard.

Escape single quote in Xtend template expression

I have a very simple question, but could not figure it out by Google search, please help.
I want to produce this string '\u0000' (note the simple quote marks surrounding it!) using the following simple Xtend method containing a template expression:
def String makeDefaultChar()
{
''''\u0000''''
}
However, this is not accepted as proper syntax (probably because of the four ''''). Is there an escape character for this use case, or what is the right syntax?
Thank you in advance!
P.S.
Of course I could use plain Java string like this "'\\u0000'" to achieve the same, but I want to use an Xtend template expression.
My Xtend version is: 2.9.1.v201512180746

There is no "escaping" in template expressions, so you have to use the workaround you mentioned:
'''«"'\\u0000'"»'''
or
'''«"'"»\u0000«"'"»'''
Related discussion: https://groups.google.com/forum/#!topic/xtend-lang/bVZ0nKmQGAI

Single quotes are allowed within Xtend templates as long as they do not occur at the beginning or the end of the template. So a simple workaround is to add an empty expression before/after the single quote:
'''«»'\u0000'«»'''

Regex slightly different in Ruby 2?

I just ported a small gem from Ruby 1.9.3 to the spiffy new Ruby 2.0.0. The only change I had to make was in a regular expression.
Under 1.9.3, the following regex would match any string containing characters other than digits, number-related punctuation, and whitespace (including non-breaking space).
/[^[[:space:]]\d\-,\.]/
Under 2.0.0, I had to move the Posix space class away from the start of the negation class.
/[^\d\-,\.[[:space:]]]/
I haven't found this change mentioned in the patch notes I've reviewed. Is it documented anywhere?

The regular expression engine has been changed to Onigmo (based on Oniguruma) and this might be causing issues.
As far as I can tell, you're declaring the regular expression incorrectly. The second set of brackets is not required:
/[^[:space:]\d\-,\.]/
The [:space:] declaration is only valid inside of a set, so you will see it appear as [[:space:]] if used in isolation. In your case you have several other additions to the set.
I'm not sure why \s would not have sufficed in this case.
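One possible reason, assuming the non-breaking space mentioned in the question matters: on UTF-8 strings \s only covers ASCII whitespace, while [[:space:]] also covers U+00A0. The sketch below checks that, along with the single-bracket form of the negated set. The constant and variable names are illustrative, and String#match? needs Ruby 2.4 or later.

```ruby
# Hypothetical constant name; the pattern is the corrected one from above.
NON_NUMERIC = /[^[:space:]\d\-,\.]/

nbsp = "\u00A0" # non-breaking space

# Digits, punctuation and whitespace only: nothing "non-numeric" to find.
plain      = "1,234.50 -7".match?(NON_NUMERIC)
with_nbsp  = "1#{nbsp}234".match?(NON_NUMERIC)

# A letter is outside the allowed set, so this one matches.
with_alpha = "12 apples".match?(NON_NUMERIC)

# The two whitespace classes disagree about the non-breaking space.
posix_sees_nbsp   = nbsp.match?(/[[:space:]]/)
slash_s_sees_nbsp = nbsp.match?(/\s/)

p [plain, with_nbsp, with_alpha, posix_sees_nbsp, slash_s_sees_nbsp]
```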

Tokenize (lex? parse?) a regular expression

Using Ruby I'd like to take a Regexp object (or a String representing a valid regex; your choice) and tokenize it so that I may manipulate certain parts.
Specifically, I'd like to take a regex/string like this:
regex = /var (\w+) = '([^']+)';/
parts = ["foo","bar"]
and create a replacement string that replaces each capture with a literal from the array:
"var foo = 'bar';"
A naïve regex-based approach to parsing the regex, such as:
i = -1
result = regex.source.gsub(/\([^)]+\)/){ parts[i+=1] }
…would fail for things like nested capture groups, or non-capturing groups, or a regex that had a parenthesis inside a character class. Hence my desire to properly break the regex into semantically-valid pieces.
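To make the failure concrete, here is the naive approach wrapped in a small helper and run on both the simple case and a regex with a non-capturing group. The helper name naive_fill and the second regex are made up for this illustration.

```ruby
# Hypothetical helper: replaces every parenthesized chunk of the regex source
# with the next entry of parts, without knowing what kind of group it is.
def naive_fill(regex, parts)
  i = -1
  regex.source.gsub(/\([^)]+\)/) { parts[i += 1] }
end

# Works when every pair of parentheses is a simple capture group.
simple = naive_fill(/var (\w+) = '([^']+)';/, ["foo", "bar"])

# Goes wrong here: the non-capturing group (?:var|let) is treated as a capture
# and consumes the first replacement, leaving none for the real capture.
broken = naive_fill(/(?:var|let) (\w+)/, ["x"])

p simple
p broken
```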
Is there an existing Regex parser available for Ruby? Is there a (horror of horrors) known regex that cleanly matches regexes? Is there a gem I've not found?
The motivation for this question is a desire to find a clean and simple answer to this question.

I have a JavaScript project on GitHub that you may want to look at: Dynamic (?:Regex Highlighting)++ with Javascript! It parses PCRE-compatible regular expressions written in both free-spacing and non-free-spacing modes. Since the regexes that do the parsing are written in the less feature-rich JavaScript syntax, they could easily be converted to Ruby.
Note that regular expressions may contain arbitrarily nested parenthesized structures and JavaScript has no recursive regex features, so the code must parse the tree of nested parentheses from the inside out. It's a bit tricky but works quite well. Be sure to try it out on the highlighter demo page, where you can input and dynamically highlight any regex. The JavaScript regular expressions used to parse regular expressions are documented here.

Matching braces in ruby with a character in front

I have read quite a few posts here for matching nested braces in Ruby using Regexp. However I cannot adapt it to my situation and I am stuck. The Ruby 1.9 book uses the following to match a set of nested braces
/\A(?<brace_expression>{([^{}]|\g<brace_expression>)*})\Z/x
I am trying to alter this in three ways. 1. I want to use parentheses instead of braces, 2. I want a character in front (such as a hash symbol), and 3. I want to match anywhere in the string, not just beginning and end. Here is what I have so far.
/(#(?<brace_expression>\(([^\(\)]|\g<brace_expression>)*\)))/x
Any help in getting the right expression would be appreciated.

Using the regex modifier x enables comments in the regex. So the # in your regex is interpreted as a comment character and the rest of the regex is ignored. You'll need to either escape the # or remove the x modifier.
Btw: There's no need to escape the parentheses inside [].
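Both fixes could be sketched as follows. The constant names and the sample strings are made up, and the outer capturing parentheses from the question are dropped on the assumption that only the whole match is needed.

```ruby
# Option 1: keep the x modifier and escape the hash so it is not a comment.
WITH_X = /\#(?<brace_expression>\(([^()]|\g<brace_expression>)*\))/x

# Option 2: drop the x modifier; a bare # is then an ordinary character.
WITHOUT_X = /#(?<brace_expression>\(([^()]|\g<brace_expression>)*\))/

text = "foo #(a (b) c) bar"

# String#[] with a regex returns the matched substring, or nil if none.
found_with_x    = text[WITH_X]
found_without_x = text[WITHOUT_X]

# Unbalanced parentheses should not match at all.
unbalanced = "#(a (b c"[WITH_X]

p found_with_x
p found_without_x
p unbalanced
```

Because neither pattern is anchored with \A or \Z, the match can occur anywhere in the string.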
