How to find string containing regex special chars with regex - ruby

I have the following piece of code:
details =~ /.#{action.name}.*/
If action.name contains a regular string such as "abcd", everything works,
but if action.name contains special characters such as . or /, I get an exception.
Is there a way to check the action.name string without having to put \ before every special character inside action.name?

You can escape all special characters using Regexp::escape.
Try:
details =~ /.#{Regexp.escape(action.name)}.*/
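As a minimal sketch of what the escaping does, the snippet below wraps the same pattern in a helper. The method name `mentions_action?` and the sample strings are made up for illustration; only `Regexp.escape` comes from the answer.

```ruby
# Hypothetical helper: true if `details` contains `name` literally,
# preceded by any single character (same pattern as in the answer).
def mentions_action?(details, name)
  !!(details =~ /.#{Regexp.escape(name)}.*/)
end

# Regexp.escape puts a backslash in front of regex metacharacters.
puts Regexp.escape("run(1.5")   # prints run\(1\.5

# Without escaping, "run(1.5" has an unmatched "(" and raises RegexpError
# when the pattern is built; with escaping it is matched literally.
puts mentions_action?("task: run(1.5) finished", "run(1.5")   # prints true

# The escaped "." no longer matches an arbitrary character.
puts mentions_action?("task: run(1x5) finished", "run(1.5")   # prints false
```

If the name should be matched exactly as a whole pattern, `Regexp.union(name)` is another option, since it also escapes string arguments.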

Related

Extracting a string using regular expression

I need to extract a string such as 'MT/23232' from messages like the ones below. I have written some code, but
it's not working. Can anyone help me here?
'Policy created with MT/1212'
'Policy created with MT/121212'
'Policy created with MT/21212121212'
I have written this code
msg="MT/33235"
id = msg.scan(/MT/\d+/\d+/)[0]
But it's not working for me. Can anyone help me extract this string?
You need to escape the forward slash next to MT in your regex, and you don't need a forward slash after \d+. I also suggest adding a lookbehind so that you get a clean result. (?<=\s) is a positive lookbehind, which asserts that the match must be preceded by a whitespace character.
msg.scan(/(?<=\s)MT\/\d+/)[0]
If you don't care about the preceding character, the regex below is fine.
msg.scan(/MT\/\d+/)[0]
Example:
> msg = 'Policy created with MT/21212121212'
=> "Policy created with MT/21212121212"
> msg.scan(/(?<=\s)MT\/\d+/)[0]
=> "MT/21212121212"
> msg.match(/(?<=\s)MT\/\d+/)[0]
=> "MT/21212121212"
your_string.scan(/\sMT.*$/).last.strip
If your required substring can be anywhere in the string, then:
your_string.scan(/\bMT\/\d+\b/).last.strip # "\b" is for word boundaries
Or you can specify the acceptable digits this way:
your_string.scan(/\bMT\/[0-9]+\b/).last.strip
Lastly, if the string format is going to remain as you specified, then:
your_string.split.last
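The approaches above can be sketched as one small helper. The method name `policy_id` is hypothetical; the pattern is the word-boundary one from the answer, written with `%r{}` so the forward slash needs no escaping.

```ruby
# Hypothetical helper: returns the first "MT/<digits>" token, or nil.
# %r{...} is an alternative regex literal in which "/" is not a delimiter.
def policy_id(msg)
  msg[%r{\bMT/\d+\b}]
end

puts policy_id('Policy created with MT/1212')         # prints MT/1212
puts policy_id('Policy created with MT/21212121212')  # prints MT/21212121212
p policy_id('Policy created, no id yet')              # prints nil
```

`String#[]` with a regex returns nil when nothing matches, whereas `scan(...).last.strip` raises NoMethodError in that case because `last` on an empty array is nil.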

PregMatch . space and #?

Can someone tell me, what's wrong in this code:
if ((!preg_match("[a-zA-Z0-9 \.\s]", $username)) || (!preg_match("[a-zA-Z0-9 \.\s]", $password)));
exit("result_message=Error: invalid characters");
}
??
Several things are wrong. I assume that the code you are looking for is:
if (preg_match('~[^a-z0-9\h.]~i', $username) || preg_match('~[^a-z0-9\h.]~i', $password))
exit('result_message=Error: invalid characters');
What is wrong in your code?
The pattern [a-zA-Z0-9 \.\s] is wrong for several reasons:
A regex pattern in PHP must be enclosed by delimiters. The most common is /, but as you can see, I have chosen ~. Example: /[a-zA-Z0-9 \.\s]/
The character class is strange because it contains a space and also \s, which already includes the space. IMO, to check a username or a password you only need the space and perhaps the tab, but not the carriage return or the line feed character! You can remove \s and keep the space, or you can use the \h character class, which matches all horizontal whitespace: /[a-zA-Z0-9\h\.]/ (if you don't want to allow tabs, replace \h with a space)
The dot has no special meaning inside a character class and doesn't need to be escaped: /[a-zA-Z0-9\h.]/
You are trying to verify a whole string, but your pattern matches a single character! In other words, the pattern only checks whether the string contains at least one alphanumeric character, space or dot. To check the whole string you must use a quantifier + and anchors for the start ^ and the end $ of the string. Example: /^[a-zA-Z0-9\h.]+$/
Finally, you can shorten the character class by using the case-insensitive modifier i: /^[a-z0-9\h.]+$/i
But there is a faster way: instead of negating your preg_match call with ! and testing that every character is in the range you want, you can test whether the string contains a single character you don't want. To do this, negate the character class by inserting a ^ as its first character:
preg_match('/[^a-z0-9\h.]/i', ...
(Note that the ^ has a different meaning inside and outside a character class. If ^ isn't at the beginning of a character class, it is a simple literal character.)
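For comparison with the Ruby threads on this page, here is a rough Ruby translation of the two checks. It is a sketch, not the PHP code from the answer: the names `INVALID_CHAR`, `WHOLE_VALID` and `invalid_chars?` are invented, and because \h means a hexadecimal digit in Ruby rather than horizontal whitespace, the class spells out space and tab explicitly.

```ruby
# Negated class: matches one character that is NOT a letter, digit,
# space, tab or dot (case-insensitive because of /i).
INVALID_CHAR = /[^a-z0-9 \t.]/i

# Anchored positive form: the whole string must consist of allowed characters.
# \A and \z anchor to the start and end of the string in Ruby.
WHOLE_VALID = /\A[a-z0-9 \t.]+\z/i

# Hypothetical helper mirroring the negated preg_match test.
def invalid_chars?(value)
  value.match?(INVALID_CHAR)
end

puts invalid_chars?("John.Doe 42")   # prints false
puts invalid_chars?("john#doe")      # prints true
puts invalid_chars?("john\ndoe")     # prints true (line feed is not allowed)
```

The two forms differ on the empty string: the negated test finds nothing to reject, while the anchored form with + requires at least one character.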

regex any non-digit with exception

I've got strings like these:
+996999966966AA
-996999966966AA
I am using this code:
"+996999966966AA".gsub!(/\D/, "")
to get rid of every character except digits, but the + sign is also stripped. How can my code retain the +?
Use:
[^+\d]
to match anything that isn't + or a digit.
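Applied to the strings from the question, that looks roughly like this. The method name `keep_plus_and_digits` is made up for the example; inside a character class the + is a literal character, so it needs no escaping.

```ruby
# Hypothetical helper: removes everything except "+" and digits.
# Non-bang gsub is used so the result is always a string;
# gsub! returns nil when nothing was replaced.
def keep_plus_and_digits(s)
  s.gsub(/[^+\d]/, "")
end

puts keep_plus_and_digits("+996999966966AA")   # prints +996999966966
puts keep_plus_and_digits("-996999966966AA")   # prints 996999966966
```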
You can also use \W, the "non-word character" class, which matches any character that is not a word character (alphanumeric or underscore).
(\W\d+)\w+

How to match any quoted strings containing Cyrillic symbols

I need to parse a lot of text files and replace any quoted strings containing Cyrillic symbols. They may contain new lines, non-alphabetic characters and special symbols (for example '$' or an escaped quote).
Can anyone help with a regex?
From comments:
For example, this PHP code:
function hello($word) {
$word2 = "ха-ха!";
echo "Привет, $word $word2\n";
}
hello('Мир');
I need to match "ха-ха!", "Привет, $word $word2\n" and 'Мир'
This should work:
str = 'The cat is under the "таблица"'
regex = /"\p{Cyrillic}+.*?\.?"/ui
str.match(regex){|s| do_stuff_with_each_matching s}
# or...
str.gsub!(regex){|s| method_that_translates_russian s}
Check it out live at http://rubular.com/r/0Mwbfinjvp.
http://www.ruby-doc.org/core-1.9.3/Regexp.html
".*[^a-zA-Z\d]+.*" matches any quoted character sequence containing at least one non-alphanumeric character.
i.e. it matches "aa$bb" and "a1$b1"
It doesn't match "aabb" or a$b.
Hope that this is what you want (Add required escaping).
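Neither pattern above handles single quotes or escaped quotes, so here is one possible sketch that does, run against the PHP sample from the question. It uses a different technique from the answers: first match every quoted string, then keep the ones containing a Cyrillic character. The names `QUOTED`, `PHP_SOURCE` and `cyrillic_strings` are invented for the example, and the extra `$plain` line was added to show a string that gets filtered out.

```ruby
# A double- or single-quoted string. Inside the quotes, allow either a
# character that is not the quote or a backslash, or a backslash followed by
# any character (an escape). /m lets "." match a newline after a backslash.
QUOTED = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/m

# Sample input; the quoted heredoc tag keeps backslashes and $ untouched.
PHP_SOURCE = <<~'PHP'
  function hello($word) {
    $word2 = "ха-ха!";
    echo "Привет, $word $word2\n";
    $plain = "ascii only";
  }
  hello('Мир');
PHP

# Hypothetical helper: all quoted strings that contain a Cyrillic character.
def cyrillic_strings(text)
  text.scan(QUOTED).select { |quoted| quoted.match?(/\p{Cyrillic}/) }
end

puts cyrillic_strings(PHP_SOURCE)
# prints:
# "ха-ха!"
# "Привет, $word $word2\n"
# 'Мир'
```

For replacement rather than extraction, the same pattern can be passed to `gsub` with a block that returns the quoted string unchanged when it has no Cyrillic characters. This sketch does not understand PHP comments or heredocs, so quotes inside those would still be matched.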

How to remove the first 4 characters from a string if it matches a pattern in Ruby

I have the following string:
"h3. My Title Goes Here"
I basically want to remove the first four characters from the string so that I just get back:
"My Title Goes Here".
The thing is I am iterating over an array of strings and not all have the h3. part in front so I can't just ditch the first four characters blindly.
I checked the docs and the closest thing I could find was chomp, but that only works for the end of a string.
Right now I am doing this:
"h3. My Title Goes Here".reverse.chomp(" .3h").reverse
This gives me my desired output, but there has to be a better way. I don't want to reverse a string twice for no reason. Is there another method that will work?
To alter the original string, use sub!, e.g.:
my_strings = [ "h3. My Title Goes Here", "No h3. at the start of this line" ]
my_strings.each { |s| s.sub!(/^h3\. /, '') }
To leave the original unchanged and only return the result, drop the exclamation point, i.e. use sub. In the general case, where a pattern can and should match more than once, use gsub! or gsub; without the g, only the first match is replaced (which is what you want here, and in any case the ^ anchor only matches at the start of a line).
You can use sub with a regular expression:
s = 'h3. foo'
s.sub!(/^h[0-9]+\. /, '')
puts s
Output:
foo
The regular expression should be understood as follows:
^ Match at the start of a line (for a single-line string, the start of the string).
h A literal "h".
[0-9] A digit from 0-9.
+ One or more of the previous (i.e. one or more digits)
\. A literal period.
(space) A literal space (yes, spaces are significant by default in regular expressions!)
You can modify the regular expression to suit your needs. See a regular expression tutorial or syntax guide for details.
A standard approach would be to use regular expressions:
"h3. My Title Goes Here".gsub(/^h3\. /, '') #=> "My Title Goes Here"
gsub means globally substitute and it replaces a pattern by a string, in this case an empty string.
The regular expression is enclosed in / and consists of:
^ means the beginning of a line (here, the beginning of the string)
h3 is matched literally, so it means h3
\. - a dot normally means any character, so we escape it with a backslash
the trailing space is matched literally
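Putting the answers together for the array case in the question, a minimal sketch might look like this. The names `HEADING_PREFIX` and `strip_heading_prefix` are hypothetical, and the pattern uses \A instead of ^ so that it only matches at the very start of the string, even if a title contains a newline.

```ruby
# "h", one or more digits, a literal dot and a space, at the start of the string.
HEADING_PREFIX = /\Ah[0-9]+\. /

# Hypothetical helper: returns new strings and leaves the originals untouched.
def strip_heading_prefix(titles)
  titles.map { |title| title.sub(HEADING_PREFIX, "") }
end

titles = [
  "h3. My Title Goes Here",
  "No h3. at the start of this line",
  "h12. Deep heading"
]

puts strip_heading_prefix(titles)
# prints:
# My Title Goes Here
# No h3. at the start of this line
# Deep heading
```

For a fixed prefix with no digits to vary, `String#delete_prefix("h3. ")` (Ruby 2.5 and later) does the same job without a regular expression.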
