I tried using a regular expression to capture names:
r[1].scan(/^([A-Z]|[ŞİÇÖÜĞ])([a-z]|[şŞıİçÇöÖüÜĞğ])*\s([A-Z]|[ŞİÇÖÜĞ])([a-z]|[şŞıİçÇöÖüÜĞğ])*/u)
But it gives me an error:
syntax error, unexpected $end, expecting ')'
...atches = r[1].scan(/^([A-Z]|[ŞİÇÖÜĞ])([a-z]|[şŞ�...
...
I see that the problem is the Turkish characters I'm using. Is it possible to use the Unicode values of the characters in a regexp? How can I use these problematic characters in this regexp?
Use Ruby 1.9, which supports Unicode character properties in regular expressions, and go with /\p{Word}+\p{Space}\p{Word}*/
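As a minimal sketch of the property-class approach, assuming Ruby 1.9 or later: the pattern below replaces the hand-written Turkish character lists with \p{Lu} (any uppercase letter) and \p{Ll} (any lowercase letter), which is stricter than \p{Word}. The names NAME_PATTERN and leading_name are made up for illustration and do not come from the original post.

```ruby
# Hypothetical helper, not from the original post.
# \p{Lu} = any Unicode uppercase letter, \p{Ll} = any Unicode lowercase letter.
NAME_PATTERN = /\A\p{Lu}\p{Ll}*\s\p{Lu}\p{Ll}*/

# Returns the leading "Firstname Lastname" pair, or nil if the text
# does not start with two capitalized words.
def leading_name(text)
  match = NAME_PATTERN.match(text)
  match && match[0]
end
```

To answer the "Unicode values" part of the question: a string or regexp can also name characters by code point, for example "\u015E" for Ş, so leading_name("\u015E\u00FCkr\u00FC \u00D6zt\u00FCrk geldi") returns the first two words.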
I'm trying to convert this Ruby regex to Go.
GROUP_CALL = /^(?<i1>[ \t]*)group\(?[ \t]*(?<grps>#{SYMBOLS})[ \t]*\)?[ \t]+do[ \t]*?\n(?<blk>.*?)\n^\k<i1>end[ \t]*$/m
I converted it to:
groupCall := regexp.MustCompile("^(?P<i1>[ \\t]*)group\\(?[ \\t]*(?P<grps>(?::\\w+|:\"[^\"#]+\"|:'[^']+')([ \\t]*,[ \\t]*(?::\\w+|:\"[^\"#]+\"|:'[^']+'))*)[ \\t]*\\)?[ \\t]+do[ \\t]*?\\n(?P<blk>.*?)\\n^\\k<i1>end[ \\t]*$/s")
But when I run it, I get this error:
error parsing regexp: invalid escape sequence: \k
There's no mention of \k in the Go docs. Is there no equivalent in Go?
Lookbehinds aren't supported, and neither are backreferences, as #stribizhev mentioned.
Regular Expression 2 (RE2) Syntax Reference:
https://github.com/google/re2/wiki/Syntax
The syntax of the regular expressions accepted is the same general
syntax used by Perl, Python, and other languages. More precisely, it
is the syntax accepted by RE2 and described at
//code.google.com/p/re2/wiki/Syntax, except for \C. --GoLang Docs
(ref: https://golang.org/pkg/regexp/)
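Since RE2 has no backreferences, one common workaround is to split the match into two passes: capture the indentation first, then build a second pattern from it. The sketch below shows the idea in Ruby on a deliberately simplified header (it drops the SYMBOLS part of the original pattern); HEADER and group_body are hypothetical names. The same two-step structure should carry over to Go, where regexp.QuoteMeta plays the role of Regexp.escape.

```ruby
# Simplified, hypothetical header pattern: "<indent>group do".
# No backreference is used anywhere, so the approach is RE2-friendly.
HEADER = /^(?<indent>[ \t]*)group[ \t]+do[ \t]*$/

# Returns the text between "group do" and the first "end" line that has
# exactly the same indentation, or nil if either part is missing.
def group_body(source)
  header = HEADER.match(source)
  return nil unless header

  # Second pass: build the closing pattern from the captured indent.
  closer = /^#{Regexp.escape(header[:indent])}end[ \t]*$/
  close = closer.match(header.post_match)
  return nil unless close

  # Drop the newline after the header and the one before "end".
  close.pre_match.sub(/\A\n/, '').chomp
end
```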
For the code
url.gsub(/"|\[|]| /, '')
Ruby raises the warning
warning: regular expression has ']' without escape: /"|\[|]| /
How can I fix it?
Your regex can be reduced to a single character class:
url.gsub(/[ "\[\]]/, '')
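A short sketch comparing the two ways to silence the warning: escaping the stray ']' in the original alternation, or using the character class from the answer. The method names are hypothetical and only there to make the comparison runnable.

```ruby
# Hypothetical helper names, for illustration only.

# Option 1: keep the alternation but escape the closing bracket.
def strip_with_alternation(url)
  url.gsub(/"|\[|\]| /, '')
end

# Option 2: one character class covering space, quote, and both brackets.
def strip_with_class(url)
  url.gsub(/[ "\[\]]/, '')
end
```

Both versions remove the same four characters; the character class is shorter and avoids the alternation entirely.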
I use preg_match in my program like this:
if (preg_match('/^[a-z0-9]+\:/{1,2}', $filename))
But it shows a warning like this:
Warning: preg_match() [function.preg-match]: Unknown modifier '{'
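How can I fix this?
You are missing the closing '/' delimiter at the end of the regex, and you need to escape the '/' inside the pattern itself. PHP treats everything after the second '/' as modifiers, which is why it complains about '{'. This should make the warning go away (leaving aside whether the regex does what you want):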
if (preg_match('/^[a-z0-9]+\:\/{1,2}/', $filename))
I am a newbie in Ruby, using version 1.9.3. I have the following regular expression:
/\\\//
As far as I know, it should match a string which has the characters '\' and '/', one following the other, right?
I am using the following code in order to get true in case the regex matches the string or symbol in the far right:
!(regex !~ :"string or symbol to match")
Because using =~ gives me the index of the match and I simply want a boolean. Besides, I'm trying to see how ugly or hackish Ruby can look compared to C :P
When I try to match the symbol :\/ the IRB prompt changes to an asterisk, and returns nothing. Why?
When I try to match the string "\/" my little ugly snippet returns false. Why?
The symbol :\/ is not a valid symbol literal. You could write :'\/' if you want a symbol version of the string '\/'. When you feed it "\/" the result is false because, inside double quotes, \/ is just an escaped slash, so the string is actually '/'. You want either '\/' or "\\/".
Finally, it's cleaner and more conventional to write your test like so:
!!(regex =~ :'\/')
!!(regex =~ '\/')
!!(regex =~ "\\/")
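The quoting difference is easy to check directly. A minimal sketch, with SLASH_PAIR and slash_pair? as made-up names wrapping the regex and the double-negation idiom from the answer:

```ruby
SLASH_PAIR = /\\\//  # a backslash followed by a forward slash

# =~ returns an index or nil; !! turns that into true or false.
def slash_pair?(value)
  !!(SLASH_PAIR =~ value)
end

single  = '\/'   # single quotes keep the backslash: two characters
double  = "\/"   # double quotes swallow it: just "/", one character
escaped = "\\/"  # escaped backslash plus slash: two characters
```

Here single.length and escaped.length are both 2, while double.length is 1, which is why only the first and third strings match.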
I am attempting to write a line of code that will take a line of Japanese text and delete a certain set of characters. However, I am having trouble using Unicode characters inside the regular expression.
I am currently using text.gsub(/《.*?》/u, '') but I get the error:
'gsub': invalid byte sequence in Windows-31J (Argument error)
Can anyone tell me what I am doing incorrectly?
Example text : その仕草《しぐさ》があまりに無造作《むぞうさ》だったので
Expected result: その仕草があまりに無造作だったので
Thanks
edit: # encoding: utf-8 is present at the top of the script.
Try this:
text.encode('utf-8', 'utf-8').gsub(/《.*?》/u, '')
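An alternative sketch, resting on an assumption the question does not confirm: that the bytes in text really are UTF-8 and were only labelled Windows-31J because that is the default external encoding on a Japanese Windows system. In that case, relabelling the string with force_encoding (which changes the tag, not the bytes) lets the UTF-8 regexp run against it. strip_readings is a hypothetical name, and the brackets are written as \u300A and \u300B so the snippet does not depend on the source file's encoding.

```ruby
# Hypothetical helper. Assumes the bytes of `text` are valid UTF-8 that
# were merely tagged with another encoding (for example Windows-31J).
def strip_readings(text)
  # force_encoding only relabels the string; dup avoids mutating the caller's copy.
  utf8 = text.dup.force_encoding('UTF-8')
  # \u300A and \u300B are the double angle brackets around each reading.
  utf8.gsub(/\u300A.*?\u300B/, '')
end
```

If the text comes from a file, opening it with an explicit encoding such as File.read(path, encoding: 'UTF-8') avoids the wrong label in the first place; path here is a placeholder.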