Do not match a word and then match a character thereafter - expression

I am trying to get a regular expression that matches a string that does not contain a certain word and contains a certain character after the word that was not matched. For instance, it should not match any word starting with 'break' followed by the ';' character, but should match any word that does not start with 'break' but ends with ';'. So in the following example:
break; // does not match
code // does not match
code; // matches
I've tried the following code, but it always matches:
/?!break;/

You can use the regular expression below:
/^(?!(break)).+;$/gi
It matches every line that ends with ";" and skips lines that start with "break".
I tested it with this group of words:
break;
breakline
breakline;
line
line;
code
code;
something
something;
From this list, the regex selects line;, code; and something;.
Hope it helps
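As a rough Ruby sketch of the same idea (the regex literal above looks like JavaScript, so this port is an assumption), the lookahead can be checked against the test words one at a time. The constant and variable names are made up for illustration, and \A and \z replace ^ and $ because each word is tested as a whole string.

```ruby
# Hypothetical constant name; the pattern is the answer's regex with
# string anchors (\A, \z) instead of line anchors (^, $).
NOT_BREAK_STATEMENT = /\A(?!break).+;\z/i

words = %w[break; breakline breakline; line line; code code; something something;]

# Keep only the words that end with ";" and do not start with "break".
selected = words.select { |word| NOT_BREAK_STATEMENT.match?(word) }

puts selected.inspect
```

The lookahead (?!break) consumes nothing: it only refuses to start a match at a position where "break" follows, which is why breakline; is rejected along with break;.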

Related

Regex to select all the commas from string that do not have any white space around them

I want to select all the commas in a string that do not have any white space around. Suppose I have this string:
"He,she, They"
I want to select only the comma between he and she. I tried this in rubular and came up with this regex:
(,[^(,\s)(\s,)])
This selects the comma that I want, but also selects an s which is a character after it.
In your regex (,[^(,\s)(\s,)]) you capture a comma followed by a negated character class, which matches any single character not listed in it. It could equally be written as (,[^)(,\s]), and it captures, for example, ,s in a group.
What you could do is use a positive lookbehind and a positive lookahead to check that what is on the left and on the right is a non-whitespace character (\S):
(?<=\S),(?=\S)
In Ruby, you may use [[:space:]] to match any (Unicode) whitespace and [^[:space:]] to match any char other than whitespace. Using these character classes inside lookarounds solves the problem:
/(?<=[^[:space:]]),(?=[^[:space:]])/
Here,
(?<=[^[:space:]]) - a positive lookbehind that matches a location that is immediately preceded with a non-whitespace char (if the string start position should also be matched, replace with (?<![[:space:]]))
, - a comma
(?=[^[:space:]]) - a positive lookahead that matches a location that is immediately followed with a non-whitespace char (if the string end position should also be matched, replace with (?![[:space:]])).
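A minimal sketch of the lookaround pattern applied to the string from the question. The constant and variable names are illustrative only; the replacement text ", " is just a way to make the selected comma visible.

```ruby
str = "He,she, They"

# Comma with a non-whitespace character on both sides (neither is consumed).
TIGHT_COMMA = /(?<=[^[:space:]]),(?=[^[:space:]])/

# Position of the first such comma.
first_index = str.index(TIGHT_COMMA)

# Replace only the selected commas so the effect is visible.
spaced = str.gsub(TIGHT_COMMA, ", ")

puts first_index
puts spaced
```

Because the lookarounds are zero-width, the match is exactly one character long (the comma), so the s after it is left alone.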
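Alternatively, match a non-whitespace character on each side of the comma and capture only the comma. Note that this pattern consumes the neighbouring characters, so in a string like a,b,c the second comma is not found:
re = /[^\s](,)[^\s]/m
str = 'check ,my,domain, qwe,sd'
# Print the captures of each match
str.scan(re) do |match|
  puts match.to_s
end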
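To see how a pattern that consumes the characters around the comma differs from the lookaround version, here is a small comparison. The sample string "a,b,c" and the variable names are assumptions chosen for illustration.

```ruby
consuming  = /[^\s](,)[^\s]/   # eats one character on each side of the comma
lookaround = /(?<=\S),(?=\S)/  # only inspects the neighbours

sample = "a,b,c"

# The first match "a,b" uses up the "b", so the second comma has no
# unconsumed character left before it.
consumed_count = sample.scan(consuming).length

# Lookarounds consume nothing, so both commas are found.
lookaround_count = sample.scan(lookaround).length

puts consumed_count
puts lookaround_count
```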

Ruby regex | Match enclosing brackets

I'm trying to create a regex pattern to match particular sets of text in my string.
Let's assume this is the string ^foo{bar}#Something_Else
I would like to match ^foo{} skipping entirely the content of the brackets.
So far I have figured out how to match everything with the regex \^(\w)\{([^\}]+)}, but I really don't know how to ignore the text inside the curly brackets.
Does anyone have an idea? Thanks.
Update
This is the final solution:
puts script.gsub(/(\^\w+)\{([^}]+)(})/, '[BEFORE]\2[AFTER]')
Though I'd prefer this with fewer groups:
puts script.gsub(/\^\w+\{([^}]+)}/, '[BEFORE]\1[AFTER]')
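For a runnable version of the shorter form, here is a sketch that assumes script holds the sample string from the question (the answer never shows its value, so that assignment is an assumption).

```ruby
# Assumed input: the sample string from the question.
script = "^foo{bar}#Something_Else"

# Match "^name{...}" and keep only what was inside the braces (group 1).
result = script.gsub(/\^\w+\{([^}]+)}/, '[BEFORE]\1[AFTER]')

puts result
```

The replacement string is single-quoted so that \1 reaches gsub as a backreference rather than being interpreted as an escape by Ruby.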
Original answer
I need to replace the ^foo{} part with something else
Here is a way to do it with gsub:
s = "^foo{bar}#Something_Else"
puts s.gsub(/(.*)\^\w+\{([^}]+)}(.*)/, '\1SOMETHING ELSE\2\3')
The technique is the same: you capture the text you want to keep and just match text you want to delete, and use backreferences to restore the text you captured.
The regex matches:
(.*) - matches and captures into Group 1 as much text as possible from the start
\^\w+\{ - matches ^, 1 or more word characters, {
([^}]+) - matches and captures into Group 2 1 or more symbols other than }
} - matches the }
(.*) - and finally match and capture into Group 3 the rest of the string.
If you mean to match ^foo{} by a single match against a regex, it is impossible. A regex match only matches a substring of the original string. Since ^foo{} is not a substring of ^foo{bar}#Something_Else, you cannot match that with a single match.
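Although a single match cannot return ^foo{} directly, it can be assembled from two captured pieces, or the braces can be emptied in place. This is a sketch under that assumption; the variable names are illustrative.

```ruby
s = "^foo{bar}#Something_Else"

# Group 1 is "^foo{", group 2 is "}"; the content between them is matched
# but not captured.
pattern = /(\^\w+\{)[^}]*(\})/

m = s.match(pattern)
skeleton = m ? m[1] + m[2] : nil   # join the two captured pieces

# Or drop the content and keep the rest of the string.
emptied = s.sub(pattern, '\1\2')

puts skeleton
puts emptied
```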

Ruby Regex gsub - everything after string

I have a string something like:
test:awesome my search term with spaces
And I'd like to extract the string immediately after test: into one variable and everything else into another, so I'd end up with awesome in one variable and my search term with spaces in another.
Logically, what I'd do is move everything matching test:* into another variable, and then remove everything before the first :, leaving me with what I wanted.
At the moment I'm using /test:(.*)([\s]+)/ to match the first part, but I can't seem to get the second part correctly.
The first capture in your regular expression is greedy, and it matches spaces because you used . (which matches any character). Instead try:
matches = string.match(/test:(\S*) (.*)/)
# index 0 is the whole pattern that was matched
first = matches[1] # this is the first () group
second = matches[2] # and the second () group
Use the following:
/^test:(.*?) (.*)$/
That is, match "test:", then a series of characters (non-greedily), up to a single space, and another series of characters to the end of the line.
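The difference between the greedy pattern from the question and the non-greedy one can be seen side by side. This sketch uses the sample string from the question; the variable names are illustrative.

```ruby
input = "test:awesome my search term with spaces"

# Greedy: (.*) grabs as much as it can and only backs off far enough
# to leave one whitespace character for ([\s]+).
greedy = input.match(/test:(.*)([\s]+)/)

# Non-greedy: (.*?) stops at the first space it can.
lazy = input.match(/^test:(.*?) (.*)$/)

puts greedy[1]
puts lazy[1]
puts lazy[2]
```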
I am guessing you want to remove all the leading spaces before the second match too, hence I have \s+ in the expression. Otherwise, remove the \s+ from the expression, and you'll have what you want:
m = /^test:(\w+)\s+(.*)/.match("test:awesome my search term with spaces")
a = m[1]
b = m[2]
http://codepad.org/JzuNQxBN
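The snippets above index into the match directly, which raises NoMethodError when the input does not match, because match then returns nil. A small sketch that guards against that; split_tagged is a hypothetical helper name, not something from the answers.

```ruby
# Hypothetical helper: returns [tag, rest] or nil when the input has no
# "test:" prefix followed by a word and some remaining text.
def split_tagged(query)
  m = query.match(/\Atest:(\S+)\s+(.*)\z/)
  m ? m.captures : nil
end

tag, rest = split_tagged("test:awesome my search term with spaces")
missing = split_tagged("no prefix here")

puts tag
puts rest
puts missing.inspect
```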

Could someone help me parse this string with regex?

I'm not very good with regex, but here's what I got (the string to parse and the regex are on this page) http://rubular.com/r/iIIYDHkwVF
It just needs to match that exact test string
The regular expression is
^"AddonInfo"$(\n\s*)+^\{\s*
It's looking for
^"AddonInfo"$ — a line containing only "AddonInfo"
(\n\s*)+ — followed by at least one newline and possibly many blank or empty lines
^\{\s* — and finally a line beginning with { followed by optional whitespace
To break down a regular expression into its component pieces, have a look at an answer that explains beginning with the basics.
To match the entire string, use
^"AddonInfo"$(\n\s*)+^\{(\s*".+?"\s+".+?"\s*\n)+^\}
So after the open curly, you're looking for one or more lines such that each contains a pair of quote-delimited simple strings (no escaping).
This one works:
^"AddonInfo"[^{]*{[^}]*}
Explanation:
^"AddonInfo" matches "AddonInfo" in the beginning of a line
[^{]* matches all the following non-{ characters
{ matches the following {
[^}]* matches all the following non-} characters
} matches the following }
^"AddonInfo"(\s*)+^\{\s*(?:"([^"]+)"\s+"([^"]*)"\s+)+\}
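In this pattern group 1 is the (\s*) run, group 2 is a key and group 3 is its value. Because groups 2 and 3 sit inside a repeated group, after a match they hold only the last key/value pair; to collect every pair, run a second pattern such as "([^"]+)"\s+"([^"]*)" over the block with scan.
Notice that the key must be non-empty ("([^"]+)"), but the value may be empty (it uses * instead of +).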
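One way to get every key and value out is to match the block first and then scan its body for quoted pairs. The input below is a made-up stand-in, since the real test string lives only on the linked Rubular page; the keys, values and variable names are all assumptions.

```ruby
# Hypothetical input shaped like the description; not the original test string.
text = <<~ADDON
  "AddonInfo"
  {
      "name"    "Example Addon"
      "author"  ""
  }
ADDON

# Step 1: grab everything between the braces that follow "AddonInfo".
body = text[/^"AddonInfo"[^{]*\{([^}]*)\}/, 1]

# Step 2: scan the body for "key" "value" pairs (value may be empty).
pairs = body.to_s.scan(/"([^"]+)"\s+"([^"]*)"/).to_h

puts pairs.inspect
```

Splitting the work in two avoids relying on a repeated capture group, which keeps only its final iteration.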

How to remove the first 4 characters from a string if it matches a pattern in Ruby

I have the following string:
"h3. My Title Goes Here"
I basically want to remove the first four characters from the string so that I just get back:
"My Title Goes Here".
The thing is I am iterating over an array of strings and not all have the h3. part in front so I can't just ditch the first four characters blindly.
I checked the docs and the closest thing I could find was chomp, but that only works for the end of a string.
Right now I am doing this:
"h3. My Title Goes Here".reverse.chomp(" .3h").reverse
This gives me my desired output, but there has to be a better way. I don't want to reverse a string twice for no reason. Is there another method that will work?
To alter the original string, use sub!, e.g.:
my_strings = [ "h3. My Title Goes Here", "No h3. at the start of this line" ]
my_strings.each { |s| s.sub!(/^h3\. /, '') }
To leave the original untouched and only return the result, drop the exclamation point and use sub. If a pattern can match more than once and you want every match replaced, use gsub! or gsub; sub and sub! replace only the first match, which is what you want here. Note that in Ruby ^ anchors at the start of each line, so use \A if the prefix must be at the very start of the string.
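A short sketch of the difference between the two forms, using the same sample strings. The variable names are illustrative, and \A is used here so the prefix is only removed at the very start of the string.

```ruby
titles = ["h3. My Title Goes Here", "No h3. at the start of this line"]

# sub returns a new string and leaves the originals alone.
cleaned = titles.map { |s| s.sub(/\Ah3\. /, "") }

# sub! edits in place and returns nil when nothing was replaced,
# so its return value should not be chained blindly.
changed   = titles.first.dup.sub!(/\Ah3\. /, "")
unchanged = titles.last.dup.sub!(/\Ah3\. /, "")

puts cleaned.inspect
puts changed.inspect
puts unchanged.inspect
```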
You can use sub with a regular expression:
s = 'h3. foo'
s.sub!(/^h[0-9]+\. /, '')
puts s
Output:
foo
The regular expression should be understood as follows:
^ Match at the start of a line (in Ruby, \A anchors to the start of the string).
h A literal "h".
[0-9] A digit from 0-9.
+ One or more of the previous (i.e. one or more digits)
\. A literal period.
(space) A literal space (yes, spaces are significant by default in regular expressions!)
You can modify the regular expression to suit your needs. See a regular expression tutorial or syntax guide for more.
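Applied to a list, the digit-based pattern strips any heading level and leaves other lines alone. The sample lines and variable names here are invented for illustration.

```ruby
# Made-up sample lines covering several heading levels and two non-headings.
lines = ["h1. Top", "h3. My Title Goes Here", "h12. Deep", "plain text", "h. not a heading"]

# "h", one or more digits, a literal period, a space, only at the string start.
stripped = lines.map { |line| line.sub(/\Ah[0-9]+\. /, "") }

puts stripped.inspect
```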
A standard approach would be to use regular expressions:
"h3. My Title Goes Here".gsub(/^h3\. /, '') #=> "My Title Goes Here"
gsub means global substitution: it replaces every match of a pattern with a string, in this case an empty string.
The regular expression is enclosed in / and consists of:
^ means the beginning of a line (use \A for the beginning of the string)
h3 is matched literally, so it means h3
\. - a dot normally means any character, so we escape it with a backslash
the trailing space is matched literally
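Two details are worth checking in Ruby specifically: ^ matches at the start of every line, not just the start of the string, and for a fixed prefix no regular expression is needed at all. This is a sketch with an invented two-line string; String#delete_prefix requires Ruby 2.5 or later.

```ruby
# Made-up multi-line input where "h3. " starts the second line only.
text = "intro\nh3. My Title Goes Here"

line_anchored   = text.sub(/^h3\. /, "")    # ^ matches after the newline too
string_anchored = text.sub(/\Ah3\. /, "")   # \A matches only at index 0

# Non-regex alternative for a fixed prefix.
no_regex = "h3. My Title Goes Here".delete_prefix("h3. ")

puts line_anchored.inspect
puts string_anchored.inspect
puts no_regex
```

delete_prefix returns the string unchanged when the prefix is absent, so it is safe to call on every element of a mixed array.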
