Ruby gsub/regex to find all chars but not specific words - ruby

My Objective:
I have a string like so:
"O_1324||T_6789||EC_67889&&(IC_12345||chicken)||true&&false"
My dream is to use a gsub regex to identify runs of [a-zA-Z0-9_] characters and replace each of them with something ("false" if you must know). However, I don't want to replace the words "true" or "false".
What I have tried:
I have been using the super friendly Rubular with little success.
I can get all the "words" (non operators) like so:
(\w+)
I tried matching all the "words" except "true" like so:
(?!true)(\w+)
This did not work. It unmatches only the "t" in true.

You can use the following regex:
\b(?:(?!true|false)\b)\w+\b
See the demo: https://regex101.com/r/eX6rE6/1
Note that you need word boundaries to match whole words, and the negative look-ahead must come before \w+, not after it.
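A minimal Ruby sketch of the replacement, using the string from the question. It is a variation on the regex above rather than the answer's exact pattern: the word boundary is moved inside the look-ahead, (?!(?:true|false)\b), so that only the whole words true and false are skipped, while a longer word such as trueish would still be replaced. The constant and method names are made up for illustration.

```ruby
# Matches any run of word characters except the whole words "true" and "false".
NON_BOOLEAN_WORD = /\b(?!(?:true|false)\b)\w+/

# Hypothetical helper: replaces every non-boolean word with "false".
def falsify_words(expression)
  expression.gsub(NON_BOOLEAN_WORD, "false")
end

input = "O_1324||T_6789||EC_67889&&(IC_12345||chicken)||true&&false"
puts falsify_words(input)
# prints: false||false||false&&(false||false)||true&&false
```

The operators (||, &&, parentheses) are untouched because \w never matches them.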

Related

Can someone help me with Ruby regex to check any word with letters starting with t and ending with r and replace with word Twitter? Thank you

I find that Rubular is very useful for working out how regexes work in Ruby.
You have two questions here. First, what regex will recognise what you want. Second, how to replace that found string with something else.
Your regex will be something like /\bt\w*r\b/. The elements here are: \b, which is a word boundary; then the letter t; then any number of word characters, \w*; then the letter r; and finally another word boundary, \b. (Without the word boundaries, your regex will find t...r inside other words too, so it would also match within 'stress', 'stirs', etc.)
To do the replacement you want the gsub method.
new_string = your_string.gsub(/\bt\w*r\b/i, 'Twitter')
This will substitute the string Twitter for the found regex. The i on the end of the regex makes it case-insensitive - omit this if you want it to only find the lower-case text as in the regex.
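As a rough illustration, the substitution could be wrapped up like this. The sample sentence and the names TR_WORD and twitterize are invented for the example.

```ruby
# A word that starts with t and ends with r, matched case-insensitively.
TR_WORD = /\bt\w*r\b/i

# Hypothetical helper: swaps every such word for "Twitter".
def twitterize(text)
  text.gsub(TR_WORD, "Twitter")
end

puts twitterize("My tutor saw a Tiger near the theater")
# prints: My Twitter saw a Twitter near the Twitter

# Without the \b anchors the pattern also fires inside other words.
puts "stress".gsub(/t\w*r/, "Twitter")
# prints: sTwitteress
```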

Regex match all strings between 2 characters

I tried a couple of other links like Regex Match all characters between two strings and Regex get all content between two characters
but they don't seem to fit this use case.
I want to get all the names (potato and tomato), i.e. everything from | to >.
text = "with <#U0D08NR3|potato> and <#U1698M96|tomato> please"
text.scan((?<=|).*?(?=>)) doesn't seem to work either.
Please guide me regex gods.
You forgot to escape the pipe (|), which is now interpreted as the indicator for an alternation.
Your regex with the escaped pipe:
(?<=\|).*?(?=>)
Here you can see the result
Just escape the |: (?<=\|).*?(?=>). Without the backslash, the lookbehind (?<=|) contains an empty alternation, which matches at any position.
Try this:
text.scan(/\|(\w*)\>/).flatten
# Returns => ["potato", "tomato"]
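This works because \w* matches only word characters, so the capture cannot run past the | on the left or the > on the right. scan returns the captured group for each match, and flatten turns the nested arrays into a flat list.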
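For comparison, here is a small sketch of two ways to pull the names out with scan, using the text from the question: one with the escaped lookbehind, one with a capture group built on a negated character class. The method names are hypothetical.

```ruby
# Lookaround version: everything after a literal | up to (not including) the next >.
def names_via_lookaround(text)
  text.scan(/(?<=\|).*?(?=>)/)
end

# Capture-group version: [^>]* cannot run past the closing >.
def names_via_capture(text)
  text.scan(/\|([^>]*)>/).flatten
end

text = "with <#U0D08NR3|potato> and <#U1698M96|tomato> please"
p names_via_lookaround(text) # => ["potato", "tomato"]
p names_via_capture(text)    # => ["potato", "tomato"]
```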

ruby regex match any character besides a specific one

I am looking for a way to match any character besides, for example, a "#."
It would look something like...
gsub(/^foo.*foo$/)
But I'd want it to match
"foofdfdfdfoo"
But not
"fooddgdgd#fdfoo"
Thanks.
^[^#]+$
http://rubular.com/r/glijo99dU9
gsub is for substitution. If you just want to match, use the .match method.
To expand on Explosion Pills' answer: a caret (^) as the first character inside a character class negates the class, so [^#] matches any single character except #. (Outside a character class, ^ anchors the match to the start of a line.) You can read more about it in the documentation.
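Applied to the foo...foo example from the question, the negated character class might be used as in the sketch below. The use of \A and \z (whole-string anchors) instead of ^ and $ is an assumption about what the asker wants, and the names are invented for illustration.

```ruby
# "foo", then any characters except "#", then "foo", spanning the whole string.
FOO_WITHOUT_HASH = /\Afoo[^#]*foo\z/

# Hypothetical helper returning true or false rather than MatchData.
def foo_wrapped_without_hash?(string)
  string.match?(FOO_WITHOUT_HASH)
end

p foo_wrapped_without_hash?("foofdfdfdfoo")    # => true
p foo_wrapped_without_hash?("fooddgdgd#fdfoo") # => false
```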

How to match anything EXCEPT this string?

How can I match a string that is NOT partners?
Here is what I have that matches partners:
/^partners$/i
I've tried the following to NOT match partners but doesn't seem to work:
/^(?!partners)$/i
Your regex
/^(?!partners)$/i
only matches empty lines because you didn't include the end-of-line anchor in your lookahead assertion. Lookaheads do just that - they "look ahead" without actually matching any characters, so only lines that match the regex ^$ will succeed.
This would work:
/^(?!partners$)/i
This reports a match with any string (or, since we're in Ruby here, any line in a multi-line string) that's different from partners. Note that it only matches the empty string at the start of the line, which is enough for validation purposes, but the match result will be "" (instead of nil, which you'd get if the match failed entirely).
Not easily, but it can be done with the look-ahead operator.
Here is the Ruby regex:
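A short sketch of how the "is not exactly partners" look-ahead behaves in Ruby. It swaps ^ and $ for \A and \z so that the test applies to the whole string rather than to each line; that swap, and the names used, are assumptions made for the example.

```ruby
# Succeeds at the start of any string that is not exactly "partners" (any case).
NOT_PARTNERS = /\A(?!partners\z)/i

# Hypothetical helper for validation-style checks.
def not_partners?(string)
  string.match?(NOT_PARTNERS)
end

p not_partners?("partners")    # => false
p not_partners?("Partners")    # => false
p not_partners?("partnership") # => true

# The match itself is zero-width, so the matched text is an empty string.
p "sales".match(NOT_PARTNERS)[0]   # => ""
p "partners".match(NOT_PARTNERS)   # => nil
```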
^((?!partners).)*$
Cheers
If you only want a true value when the string is not partners, there is no need for a regex; a case-insensitive string comparison is enough. For example, in Java:
!"partners".equalsIgnoreCase(aString)
If for some reason you need a positive regex match for any string that does not contain partners (if it's part of a larger regex, for example), you could use several different constructs, like:
^(?:(?!partners).)*$
or
^(?:[^p]+|p(?!artners))*$
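In Ruby, both approaches from this answer (the plain case-insensitive comparison and the "does not contain partners" regex) might look like the sketch below. It assumes Ruby 2.4 or later for casecmp? and match?, and single-line input, since . does not match a newline by default. The method and constant names are hypothetical.

```ruby
# Plain comparison: true when the string is "partners" in any letter case.
def partners?(string)
  string.casecmp?("partners")
end

# Regex form: matches only strings that do not contain "partners" anywhere.
NO_PARTNERS_INSIDE = /\A(?:(?!partners).)*\z/i

def free_of_partners?(string)
  string.match?(NO_PARTNERS_INSIDE)
end

p !partners?("PARTNERS")                 # => false
p !partners?("partner")                  # => true
p free_of_partners?("our partners page") # => false
p free_of_partners?("hello world")       # => true
```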

Strip words beginning with a specific letter from a sentence using regex

I'm not sure how to use regular expressions in a function so that I could grab all the words in a sentence starting with a particular letter. I know that I can do:
word =~ /^#{letter}/
to check if the word starts with the letter, but how do I go from word to word. Do I need to convert the string to an array and then iterate through each word or is there a faster way using regex? I'm using ruby so that would look like:
matching_words = Array.new
sentence.split(" ").each do |word|
  matching_words.push(word) if word =~ /^#{letter}/
end
Scan may be a good tool for this:
#!/usr/bin/ruby1.8
s = "I think Paris in the spring is a beautiful place"
p s.scan(/\b[it][[:alpha:]]*/i)
# => ["I", "think", "in", "the", "is"]
\b means "word boundary".
[[:alpha:]] matches any alphabetic character, upper or lower case.
You can use \b. It matches word boundaries--the invisible spot just before and after a word. (You can't see them, but oh they're there!) Here's the regex:
/\b(a\w*)\b/
The \w matches a word character: a letter, a digit, or an underscore.
You can see me testing it here: http://rubular.com/regexes/13347
Similar to Anon.'s answer:
/\b(a\w*)/g
and then see all the results with (usually) $n, where n is the n-th hit. Many libraries will return /g results as arrays on the $n-th set of parentheses, so in this case $1 would return an array of all the matching words. You'll want to double-check with whatever library you're using to figure out how it returns matches like this; there's a lot of variation in how global searches return results, sadly.
As to \w vs [a-zA-Z], you can sometimes get faster execution by using the built-in definitions of things like that, as the engine can easily have an optimized path for the preset character classes.
The /g at the end makes it a "global" search, so it'll find more than one match. It's still restricted by line in some languages and libraries, though, so if you wish to check an entire file you'll sometimes need /gm to make it multi-line.
If you want to remove results, like your title (but not question) suggests, try:
/\ba\w*//g
which does a search-and-replace in most languages (/<search>/<replacement>/). Sometimes you need a "s" at the front. Depends on the language / library. In Ruby's case, use:
string.gsub(/(\b)a\w*(\b)/, "\\1\\2")
to retain the non-word characters, and optionally put any replacement text between \1 and \2. gsub for global, sub for the first result.
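Ruby regex literals have no /g flag; scan and gsub already operate on every match. A minimal sketch of both the "find" and the "remove" variants follows, with an invented sample sentence and hypothetical method names. The whitespace clean-up after gsub is an extra step not mentioned in the answer.

```ruby
# A whole word beginning with a lowercase "a".
A_WORD = /\ba\w*\b/

# Collects every matching word; scan is global by nature.
def words_starting_with_a(sentence)
  sentence.scan(A_WORD)
end

# Removes the matching words, then tidies the leftover spaces.
def strip_words_starting_with_a(sentence)
  sentence.gsub(A_WORD, "").squeeze(" ").strip
end

sentence = "an apple a day keeps all doctors away"
p words_starting_with_a(sentence)
# => ["an", "apple", "a", "all", "away"]
puts strip_words_starting_with_a(sentence)
# prints: day keeps doctors
```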
/\ba[a-z]*\b/i
will match any word starting with 'a'.
The \b indicates a word boundary - we want to only match starting from the beginning of a word, after all.
Then there's the character we want our word to start with.
Then we have as many as possible letter characters, followed by another word boundary.
To match all words starting with t, use:
\bt\w+
That will match test but not footest; \b means "word boundary".
Personally, I think a regex is overkill for this application; a simple select is more than capable of solving this particular problem.
"this is a test".split(' ').select{ |word| word[0,1] == 't' }
result => ["this", "test"]
Or, if you are determined to use a regex, go with grep:
"this is a test".split(' ').grep(/^t/)
result => ["this", "test"]
Hope this helps.
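Since the question asks for a function that takes the letter as input, the scan approach from the answers above could be parameterised roughly like this. The method name is hypothetical, and escaping the letter with Regexp.escape is an added precaution in case a regex metacharacter is passed in.

```ruby
# Returns every word in the sentence that starts with the given letter,
# ignoring case. No splitting or manual iteration is needed.
def words_starting_with(sentence, letter)
  sentence.scan(/\b#{Regexp.escape(letter)}\w*/i)
end

p words_starting_with("This is a test of the thing", "t")
# => ["This", "test", "the", "thing"]
```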

Resources