I have a string like this
"base: [_0x3e63[241], _0x3e63[242]],
gray: [_0x3e63[243], _0x3e63[244], _0x3e63[245], _0x3e63[246], _0x3e63[247], _0x3e63[248], _0x3e63[249], _0x3e63[250], _0x3e63[251], _0x3e63[252]],
red: [_0x3e63[253], _0x3e63[254], _0x3e63[255], _0x3e63[256], _0x3e63[257], _0x3e63[258], _0x3e63[259], _0x3e63[260], _0x3e63[261], _0x3e63[262]],
pink: [_0x3e63[263], _0x3e63[264], _0x3e63[265], _0x3e63[266], _0x3e63[267], _0x3e63[268], _0x3e63[269], _0x3e63[270], _0x3e63[271], _0x3e63[272]],
grape: [_0x3e63[273], _0x3e63[274], _0x3e63[275], _0x3e63[276], _0x3e63[277], _0x3e63[278], _0x3e63[279], _0x3e63[280], _0x3e63[281], _0x3e63[282]],
violet: [_0x3e63[283], _0x3e63[284], _0x3e63[285], _0x3e63[286], _0x3e63[287], _0x3e63[288], _0x3e63[289], _0x3e63[290], _0x3e63[291], _0x3e63[292]],
indigo: [_0x3e63[293], _0x3e63[294], _0x3e63[295], _0x3e63[296], _0x3e63[297], _0x3e63[298], _0x3e63[299], _0x3e63[300], _0x3e63[301], _0x3e63[302]],
blue: [_0x3e63[303], _0x3e63[304], _0x3e63[305], _0x3e63[306], _0x3e63[307], _0x3e63[308], _0x3e63[309], _0x3e63[310], _0x3e63[311], _0x3e63[312]],
cyan: [_0x3e63[313], _0x3e63[314], _0x3e63[315], _0x3e63[316], _0x3e63[317], _0x3e63[318], _0x3e63[319], _0x3e63[320], _0x3e63[321], _0x3e63[322]],
teal: [_0x3e63[323], _0x3e63[324], _0x3e63[325], _0x3e63[326], _0x3e63[327], _0x3e63[328], _0x3e63[329], _0x3e63[330], _0x3e63[331], _0x3e63[332]],
green: [_0x3e63[333], _0x3e63[334], _0x3e63[335], _0x3e63[336], _0x3e63[337], _0x3e63[338], _0x3e63[339], _0x3e63[340], _0x3e63[341], _0x3e63[342]],
lime: [_0x3e63[343], _0x3e63[344], _0x3e63[345], _0x3e63[346], _0x3e63[347], _0x3e63[348], _0x3e63[349], _0x3e63[350], _0x3e63[351], _0x3e63[352]],
yellow: [_0x3e63[353], _0x3e63[354], _0x3e63[355], _0x3e63[356], _0x3e63[357], _0x3e63[358], _0x3e63[359], _0x3e63[360], _0x3e63[361], _0x3e63[362]],
orange: [_0x3e63[363], _0x3e63[364], _0x3e63[365], _0x3e63[366], _0x3e63[367], _0x3e63[368], _0x3e63[369], _0x3e63[370], _0x3e63[371], _0x3e63[372]]"
_0x3e63 is a Ruby array with the values:
_0x3e63 = ["#f783ac", "#faa2c1", "#fcc2d7", "#ffdeeb", "#fff0f6", "#862e9c", "#9c36b5", "#ae3ec9", "#be4bdb", "#cc5de8", "#da77f2", "#e599f7", "#eebefa", "#f3d9fa", "#f8f0fc", "#5f3dc4", "#6741d9", "#7048e8", "#7950f2", "#845ef7", "#9775fa", "#b197fc", "#d0bfff", "#e5dbff", "#f3f0ff", "#364fc7", "#3b5bdb", "#4263eb", "#4c6ef5", "#5c7cfa", "#748ffc", "#91a7ff", "#bac8ff", "#dbe4ff", "#edf2ff", "#1864ab", "#1971c2", "#1c7ed6", "#228be6", "#339af0", "#4dabf7", "#74c0fc", "#a5d8ff", "#d0ebff", "#e7f5ff", "#0b7285", "#0c8599", "#1098ad", "#15aabf", "#22b8cf", "#3bc9db", "#66d9e8", "#99e9f2", "#c5f6fa", "#e3fafc", "#087f5b", "#099268", "#0ca678", "#12b886", "#20c997", "#38d9a9", "#63e6be", "#96f2d7", "#c3fae8", "#e6fcf5", "#2b8a3e", "#2f9e44", "#37b24d", "#40c057", "#51cf66", "#69db7c", "#8ce99a", "#b2f2bb", "#d3f9d8", "#ebfbee", "#5c940d", "#66a80f", "#74b816", "#82c91e", "#94d82d", "#a9e34b", "#c0eb75", "#d8f5a2", "#e9fac8", "#f4fce3", "#e67700", "#f08c00", "#f59f00", "#fab005", "#fcc419", "#ffd43b", "#ffe066", "#ffec99", "#fff3bf", "#fff9db", "#d9480f", "#e8590c"]
I cannot find a way to locate each _0x3e63[xxxxxxx] in the string and replace it with the corresponding value from the array.
Use String#gsub with a block.
Assuming your input string is stored in the variable input, the following code does the replacement and displays the result:
puts input.gsub(/_0x3e63\[(\d+)\]/){|s| _0x3e63[$1.to_i]}
(The array _0x3e63 you posted in the question does not contain enough values to have indices like 247 or 251 but the code works nevertheless.)
The code is very simple. The regular expression /_0x3e63\[(\d+)\]/ matches any string that starts with _0x3e63[, continues with one or more digits (\d+) and ends with ].
For each match the block is executed and the value returned by the block is used to replace the matched piece of the original string.
The replacement uses $1 (that contains the sub-string that matches the first capturing group) as an index into the array _0x3e63. Because the value of $1 is a string, .to_i is used to convert it to a number (required to be used as index in the array).
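As a minimal sketch of the same technique, the example below wraps the replacement in a helper and keeps any reference whose index is out of range, instead of replacing it with an empty string (which is what a nil lookup produces in the one-liner above). The method name resolve_refs and the shortened palette are hypothetical and only serve the illustration.

```ruby
# Replace every "_0x3e63[N]" reference in text with table[N].
# References whose index is outside the table are left untouched.
def resolve_refs(text, table)
  text.gsub(/_0x3e63\[(\d+)\]/) do |match|
    # Array#fetch with a block returns the block's value for a missing index.
    table.fetch(Regexp.last_match(1).to_i) { match }
  end
end

# Hypothetical, shortened data for illustration only.
palette = ["#f783ac", "#faa2c1", "#fcc2d7"]
input = "base: [_0x3e63[0], _0x3e63[2]],\ngray: [_0x3e63[1], _0x3e63[99]]"

puts resolve_refs(input, palette)
# base: [#f783ac, #fcc2d7],
# gray: [#faa2c1, _0x3e63[99]]
```

Leaving unresolved references in place makes gaps in the array easy to spot in the output.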
We are given:
str =<<~END
base: [arr[6], arr[3]],
gray: [arr[0], arr[4], arr[1], arr[5]],
red: [arr[2]]
END
#=> "base: [arr[6], arr[3]],\ngray: [arr[0], arr[4], arr[1], arr[5]],\nred: [arr[2]]\n"
and
arr = ["#f783ac", "#faa2c1", "#fcc2d7", "#ffdeeb", "#fff0f6", "#862e9c",
"#9c36b5"]
We can perform the required replacements by using String#gsub with a regular expression and Kernel#eval:
puts str.gsub(/\barr\[\d+\]/) { |s| eval s }
base: [#9c36b5, #ffdeeb],
gray: [#f783ac, #fff0f6, #faa2c1, #862e9c],
red: [#fcc2d7]
The regular expression performs the following operations:
\b # match a word boundary (to avoid matching 'gnarr')
arr\[ # match string 'arr['
\d+ # match 1+ digits
\] # match ']'
Rubular
One must be cautious about using eval (to avoid launching missiles inadvertently, for example), but as long as the matches of the string can be trusted it's a perfectly safe and useful method.
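If the matched text cannot be fully trusted, one possible alternative is to avoid eval and look the name up in an explicit table instead. The sketch below assumes a hypothetical helper, substitute, and a hash that lists the array names allowed to be dereferenced. Neither is part of the answer above.

```ruby
# Replace "name[N]" with tables[name][N], but only for names listed in tables.
# Unknown names and out-of-range indices are left as they are.
def substitute(text, tables)
  text.gsub(/\b(\w+)\[(\d+)\]/) do |match|
    table = tables[Regexp.last_match(1)]
    index = Regexp.last_match(2).to_i
    table ? table.fetch(index) { match } : match
  end
end

arr = ["#f783ac", "#faa2c1", "#fcc2d7", "#ffdeeb", "#fff0f6", "#862e9c", "#9c36b5"]
str = "base: [arr[6], arr[3]],\ngray: [arr[0], arr[4], arr[1], arr[5]],\nred: [arr[2]]\n"

puts substitute(str, "arr" => arr)
# base: [#9c36b5, #ffdeeb],
# gray: [#f783ac, #fff0f6, #faa2c1, #862e9c],
# red: [#fcc2d7]
```

Nothing from the input is ever executed, so a malicious or malformed match can at worst stay unreplaced.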
I have a sentence like this:
Hello #[Pratha](user:1), did you see #[John](user:3)'s answer?
What I want is to get #[Pratha](user:1) and #[John](user:3): either their names and ids, or just the text as quoted, so that I can explode and parse the name and id myself.
But there is an issue here: the names Pratha and John may include non-alphabetic characters like ', ,, -, +, etc., but not [] and ().
What I tried so far:
c = ''
f = c.match(/(?:\s|^)(?:#(?!(?:\d+|\w+?_|_\w+?)(?:\s(\[)|$)))(\w+)(?=\s|$)/i)
But no success.
You may use
/#\[([^\]\[]*)\]\([^()]*:(\d+)\)/
See the regex demo
Details
# - a # char
\[ - a [
([^\]\[]*) - Group 1: 0+ chars other than [ and ]
\] - a ] char
\( - a ( char
[^()]* - 0+ chars other than ( and )
: - a colon
(\d+) - Group 2: 1 or more digits
\) - a ) char.
Sample Ruby code:
s = "Hello #[Pratha](user:1), did you see #[John](user:3)'s answer?"
rx = /#\[([^\]\[]*)\]\([^()]*:(\d+)\)/
res = s.scan(rx)
p res
# => [["Pratha", "1"], ["John", "3"]]
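A possible refinement, sketched here under the assumption that the part before the colon (user in the sample) is also of interest, is to use named groups and turn each match into a hash. The constant MENTION, the method parse_mentions and the key names are illustrative choices, not part of the answer above.

```ruby
# Same pattern as above, with named groups for readability.
MENTION = /#\[(?<name>[^\]\[]*)\]\((?<kind>[^()]*):(?<id>\d+)\)/

# Return one hash per mention found in the text.
def parse_mentions(text)
  # With named groups, scan yields the captures in the order the groups appear.
  text.scan(MENTION).map do |name, kind, id|
    { name: name, kind: kind, id: id.to_i }
  end
end

s = "Hello #[Pratha](user:1), did you see #[John](user:3)'s answer?"
parse_mentions(s).each do |mention|
  puts "#{mention[:name]} (#{mention[:kind]} ##{mention[:id]})"
end
# Pratha (user #1)
# John (user #3)
```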
"Hello #[Pratha](user:1), did you see #[John](user:3)'s answer?".scan(/#.*?\)/)
#⇒ ["#[Pratha](user:1)", "#[John](user:3)"]
Since the line does not come from user input, you can rely on the part you are interested in starting with # and ending with ).
You could use 2 capturing groups to get the names and the id's:
#\[([^]]+)]\([^:]+:([^)]+)\)
That will match
# Match # literally
\[ Match [
([^]]+) 1st capturing group, which matches any char except ] 1+ times using a negated character class
] Match ]
\( Match ( literally
[^:]+: Match any char except : 1+ times, then match :
([^)]+) 2nd capturing group, which matches any char except ) 1+ times
\) Match )
Regex demo | Ruby demo
I have a regular expression that I'm using to find all the words in a given block of content, case insensitive, that are contained in a glossary stored in a database. Here's my pattern:
/($word)/i
The problem is, if I use /(Foo)/i then words like Food get matched. There needs to be whitespace or a word boundary on both sides of the word.
How can I modify my expression to match only the word Foo when it is a word at the beginning, middle, or end of a sentence?
Use word boundaries:
/\b($word)\b/i
Or if you're searching for "S.P.E.C.T.R.E." like in Sinan Ünür's example:
/(?:\W|^)(\Q$word\E)(?:\W|$)/i
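The patterns above use Perl/PHP-style interpolation. As a rough Ruby sketch of the same idea, Regexp.escape plays the role of \Q...\E, and lookarounds stand in for the (?:\W|^) and (?:\W|$) groups so that the neighbouring characters are not consumed. The method name whole_word_regex is hypothetical.

```ruby
# Build a case-insensitive pattern that matches word only when it is not
# directly preceded or followed by a word character.
def whole_word_regex(word)
  /(?<!\w)#{Regexp.escape(word)}(?!\w)/i
end

puts whole_word_regex("Foo").match?("I like foo a lot")   # true
puts whole_word_regex("Foo").match?("I like Food a lot")  # false

spectre = "S.P.E.C.T.R.E. (Special Executive for Counter-intelligence)"
puts whole_word_regex("S.P.E.C.T.R.E.").match?(spectre)           # true
# Plain \b fails here: there is no boundary between "." and " ".
puts(/\b#{Regexp.escape("S.P.E.C.T.R.E.")}\b/i.match?(spectre))   # false
```

The lookaround form behaves like \b for ordinary words but also works when the term itself begins or ends with punctuation.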
To match any whole word you would use the pattern (\w+)
Assuming you are using PCRE or something similar, see this live example: http://regex101.com/r/cU5lC2
Matching any whole word on the commandline with (\w+)
I'll be using the phpsh interactive shell on Ubuntu 12.10 to demonstrate the PCRE regex engine through the preg_match function.
Start phpsh, put some content into a variable, match on word.
el#apollo:~/foo$ phpsh
php> $content1 = 'badger'
php> $content2 = '1234'
php> $content3 = '$%^&'
php> echo preg_match('(\w+)', $content1);
1
php> echo preg_match('(\w+)', $content2);
1
php> echo preg_match('(\w+)', $content3);
0
The preg_match function uses the PCRE engine within PHP to test the variables $content1, $content2 and $content3 against the (\w+) pattern.
$content1 and $content2 contain at least one word (a run of letters, digits, or underscores); $content3 does not.
Match a number of literal words on the commandline with (dart|fart)
el#apollo:~/foo$ phpsh
php> $gun1 = 'dart gun';
php> $gun2 = 'fart gun';
php> $gun3 = 'farty gun';
php> $gun4 = 'unicorn gun';
php> echo preg_match('(dart|fart)', $gun1);
1
php> echo preg_match('(dart|fart)', $gun2);
1
php> echo preg_match('(dart|fart)', $gun3);
1
php> echo preg_match('(dart|fart)', $gun4);
0
Variables $gun1 and $gun2 contain the string dart or fart; $gun4 does not. However, $gun3 also matches, because looking for the word fart matches farty, which may be a problem. To fix this, enforce word boundaries in the regex.
Match literal words on the commandline with word boundaries.
el#apollo:~/foo$ phpsh
php> $gun1 = 'dart gun';
php> $gun2 = 'fart gun';
php> $gun3 = 'farty gun';
php> $gun4 = 'unicorn gun';
php> echo preg_match('(\bdart\b|\bfart\b)', $gun1);
1
php> echo preg_match('(\bdart\b|\bfart\b)', $gun2);
1
php> echo preg_match('(\bdart\b|\bfart\b)', $gun3);
0
php> echo preg_match('(\bdart\b|\bfart\b)', $gun4);
0
This is the same as the previous example, except that farty no longer matches: the word fart delimited by \b word boundaries does not occur in it.
Using \b can yield surprising results. You would be better off figuring out what separates a word from its definition and incorporating that information into your pattern.
#!/usr/bin/perl
use strict; use warnings;
use re 'debug';
my $str = 'S.P.E.C.T.R.E. (Special Executive for Counter-intelligence,
Terrorism, Revenge and Extortion) is a fictional global terrorist
organisation';
my $word = 'S.P.E.C.T.R.E.';
if ( $str =~ /\b(\Q$word\E)\b/ ) {
print $1, "\n";
}
Output:
Compiling REx "\b(S\.P\.E\.C\.T\.R\.E\.)\b"
Final program:
1: BOUND (2)
2: OPEN1 (4)
4: EXACT (9)
9: CLOSE1 (11)
11: BOUND (12)
12: END (0)
anchored "S.P.E.C.T.R.E." at 0 (checking anchored) stclass BOUND minlen 14
Guessing start of match in sv for REx "\b(S\.P\.E\.C\.T\.R\.E\.)\b" against "S.P
.E.C.T.R.E. (Special Executive for Counter-intelligence,"...
Found anchored substr "S.P.E.C.T.R.E." at offset 0...
start_shift: 0 check_at: 0 s: 0 endpos: 1
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx "\b(S\.P\.E\.C\.T\.R\.E\.)\b" against "S.P.E.C.T.R.E. (Special Exec
utive for Counter-intelligence,"...
0 | 1:BOUND(2)
0 | 2:OPEN1(4)
0 | 4:EXACT (9)
14 | 9:CLOSE1(11)
14 | 11:BOUND(12)
failed...
Match failed
Freeing REx: "\b(S\.P\.E\.C\.T\.R\.E\.)\b"
For those who want to validate an enum in their code, you can follow this guide:
In regex, ^ marks the start of a string and $ marks the end. Using them in combination with | could be what you want:
^(Male)$|^(Female)$
It will match only the exact strings Male or Female.
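One caveat when carrying this over to Ruby: there, ^ and $ match at the start and end of each line rather than of the whole string, so \A and \z are the safer anchors for validation. The sketch below illustrates the difference. The constant and method names are hypothetical.

```ruby
# Accept only the exact strings "Male" or "Female".
GENDER_PATTERN = /\A(?:Male|Female)\z/

def valid_gender?(value)
  GENDER_PATTERN.match?(value)
end

puts valid_gender?("Male")          # true
puts valid_gender?("Female")        # true
puts valid_gender?("Females")       # false
puts valid_gender?("Male\nAdmin")   # false

# With ^ and $ the multi-line input slips through in Ruby:
puts "Male\nAdmin".match?(/^(Male)$|^(Female)$/)  # true
```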
If you are doing it in Notepad++
[\w]+
would give you the entire word, and you can add parentheses to capture it as a group. Example: conv1 = Conv2D(64, (3, 3), activation=LeakyReLU(alpha=a), padding='valid', kernel_initializer='he_normal')(inputs). I would like to move LeakyReLU onto its own line as a comment and replace the current activation. In Notepad++ this can be done using the following find command:
([\w]+)( = .+)(LeakyReLU.alpha=a.)(.+)
and the replace command becomes:
\1\2'relu'\4 \n # \1 = LeakyReLU\(alpha=a\)\(\1\)
The spaces are there to keep the right formatting in my code. :)
Use word boundaries (\b).
The following works in my environment (Mac, Safari Version 10.0.3 (12602.4.8)). It uses four backslashes in total, because each \b must be written as \\b inside a string literal:
var myReg = new RegExp('\\b' + variable + '\\b', 'g')
Get all "words" in a string
/([^\s]+)/g
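Basically, [^\s]+ matches runs of non-whitespace characters, so the string is effectively split on whitespace.
Don't forget the g flag (global), which returns every match instead of only the first.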
Try it:
"Not the answer you're looking for? Browse other questions tagged regex word-boundary or ask your own question.".match(/([^\s]+)/g)
→ (17) ['Not', 'the', 'answer', "you're", 'looking', 'for?', 'Browse', 'other', 'questions', 'tagged', 'regex', 'word-boundary', 'or', 'ask', 'your', 'own', 'question.']
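For comparison, a rough Ruby equivalent of the JavaScript call above: String#scan always returns every match, so no global flag is needed, and \S is shorthand for [^\s]. The method name words is hypothetical.

```ruby
# Return every run of non-whitespace characters in text.
def words(text)
  text.scan(/\S+/)
end

p words("Not the answer you're looking for?")
# ["Not", "the", "answer", "you're", "looking", "for?"]
```

As in the JavaScript version, punctuation stays attached to the neighbouring word.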
I'm working with some regular expression matching and I'm trying to figure out how to exclude a specific character pattern. Specifically, I want to exclude the following pattern:
5 - # (in words: digit, space, dash, space)
I know how to exclude the components individually: [^5 ^-] but I'm looking to exclude the specific pattern. Is this possible?
Update - I'm using Ruby as my programming language.
Here is some sample input and the desired output:
Input: 1 - Blue-Stork Stables; 2 - Young, Robert, S.; 3 - Seahorse Stable; 4 - Carney, Elvis; 5 - Guerrero, Juan, Carlos-Martin; 6 - Dubb, Michael; 7 - Summers, Hope; 8 - DTH Stables; 9 - Peebles, Matthew\n
the desired output would be:
Output: Blue-Stork Stables; Young, Robert, S.; Seahorse Stable; Carney, Elvis; Guerrero, Juan, Carlos-Martin; Dubb, Michael; Summers, Hope; DTH Stables; Peebles, Matthew\n
Please take note of the dashes on Blue-Stork Stables and Juan Carlos-Martin.
EDIT: So you mean "remove", not "exclude". No problem:
result = subject.gsub(/\d+ - /, '')
transforms your input into the desired output. I've taken the liberty to allow more than one digit (after all, if numbers reach 10 or higher, you probably want to remove those entirely, too. Right?).
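A small runnable sketch of that substitution, using a shortened, made-up input with multi-digit numbers. The method name strip_numbering is hypothetical.

```ruby
# Remove every "<digits> - " prefix. Hyphens inside names are untouched
# because the pattern requires a space on both sides of the dash.
def strip_numbering(text)
  text.gsub(/\d+ - /, "")
end

input = "9 - Blue-Stork Stables; 10 - Guerrero, Juan, Carlos-Martin; 11 - Peebles, Matthew\n"
puts strip_numbering(input)
# Blue-Stork Stables; Guerrero, Juan, Carlos-Martin; Peebles, Matthew
```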
(Old answer for "historical reasons")
Depending on what you mean by "exclude", it appears that you're looking for negative lookahead assertions:
^(?!.*\d - )
will fail on strings that contain a digit followed by space, dash, space (for example 5 - ) anywhere and succeed on all other strings:
"5 - " // fail
"5 -" // match
"abc5 - xyz" // fail
"foobar5 - " // fail
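The four cases can be checked in Ruby with a short sketch like the one below. It uses \A instead of ^ so that the check is anchored to the start of the whole string. The constant and method names are hypothetical.

```ruby
# Succeeds only when no "digit, space, dash, space" sequence follows on the line.
NO_NUMBER_DASH = /\A(?!.*\d - )/

def free_of_number_dash?(text)
  NO_NUMBER_DASH.match?(text)
end

["5 - ", "5 -", "abc5 - xyz", "foobar5 - "].each do |s|
  puts "#{s.inspect}: #{free_of_number_dash?(s) ? 'match' : 'fail'}"
end
# "5 - ": fail
# "5 -": match
# "abc5 - xyz": fail
# "foobar5 - ": fail
```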