At the moment I am using this line to take a string and extract only the letters from it:
string.scan(/[a-zA-Z]/).to_s
How do I modify this so that the underline character, "_", is also included? Thanks for reading.
Add it inside the brackets (the character class).
string.scan(/[a-zA-Z_]/).to_s
Alternative version
string.scan(/[a-z_]/i).to_s
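A quick sketch to check both patterns. One assumption to flag: on Ruby 1.9 and later, Array#to_s returns the inspected array (something like ["f", "o", "o"]) rather than the joined string, so this sketch uses join instead. The helper name letters_and_underscores is made up for illustration.

```ruby
# Hypothetical helper: keep only ASCII letters and underscores.
def letters_and_underscores(string)
  # scan returns an array of one-character matches; join glues them back together.
  string.scan(/[a-zA-Z_]/).join
end

puts letters_and_underscores("foo_bar 42!")   # prints foo_bar
puts "Foo_Bar 42!".scan(/[a-z_]/i).join       # prints Foo_Bar
```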
The text is as:
text1text2
How can I specify this text in xpath. I tried:
.//*[@id='someid']//h6[text() ='text1text2']
.//*[@id='someid']//h6[text() ='text1\ntext2']
.//*[@id='someid']//h6[text() ='text1 text2']
None of them worked
Use .//*[@id='someid']//h6[. = 'text1&#10;text2']. This assumes you are writing the path inside of XSLT or XForms, where you can use &#10; to escape a newline character. If you are not using XSLT, you might want to tell us in which host language (e.g. PHP, C#, Java) you use XPath.
Not very elegant, but it works:
.//*[@id='someid']//h6[contains(text(), 'text1') and contains(text(), 'text2')]
You can use normalize-space() to remove the line feed and compare text without this issue.
//*[@id='someid']//h6[normalize-space(text()) ='text1 text2']
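To see why the comparison against 'text1 text2' succeeds, here is a rough Ruby analogue of what XPath 1.0 normalize-space() does to a string: it trims leading and trailing whitespace and collapses each internal run to a single space. This is only an illustration, not an XPath engine, and the method name normalize_space is invented here. Note also that in XPath 1.0, normalize-space(text()) looks only at the first text node, while normalize-space(.) uses the element's whole string value.

```ruby
# Rough analogue of XPath 1.0 normalize-space(), for illustration only.
# XPath treats space, tab, CR and LF as whitespace.
def normalize_space(value)
  collapsed = value.gsub(/[ \t\r\n]+/, " ")  # collapse each whitespace run to one space
  collapsed.gsub(/\A | \z/, "")              # drop the single leading/trailing space, if any
end

puts normalize_space("text1\ntext2")          # prints text1 text2
puts normalize_space("  text1 \n\t text2 ")   # prints text1 text2
```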
This is the working code
.//*[@id='someid']//h6[. = 'text1text2']
Thank you.
The problem I'm looking at says an input is true only if every letter in the string has a '+' symbol on both sides, so "+d++" or "+d+==+a+" qualify, but not
"f++d+"
"3+a=+b+"
"++d+=c+"
I tried to solve this using regex, since it's a string pattern matching problem: /(+[a-z][^+])|([^+.][a-z]+)/. But this does not cover patterns where the letters are at the beginning or end of the string. I need help with something more comprehensive.
You should try the following:
/^\+{0,2}[a-z0-9]+\+{0,2}(=*\+{0,2}[a-z0-9]+\+{0,2})*$/
You could use the below regex.
^(?:[^\w\n]*\+[a-z]+\+)+[^\w\n]*$
If you want to match +f+g+ also, then put the following + inside a positive lookahead assertion.
^(?:[^\w\n]*\+[a-z]+(?=\+))+[^\w\n]*$
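As a sanity check, the two patterns from this answer can be run against the examples in the question. This is a sketch only: the constant names STRICT and SHARED and the method surrounded? are invented here, and String#match? assumes Ruby 2.4 or later.

```ruby
# Each letter run needs its own '+' on both sides.
STRICT = /^(?:[^\w\n]*\+[a-z]+\+)+[^\w\n]*$/
# Lookahead variant: neighbouring letter runs may share one '+'.
SHARED = /^(?:[^\w\n]*\+[a-z]+(?=\+))+[^\w\n]*$/

# Hypothetical helper wrapping the match.
def surrounded?(input, pattern = STRICT)
  pattern.match?(input)
end

%w[+d++ +d+==+a+ f++d+ 3+a=+b+ ++d+=c+ +f+g+].each do |s|
  puts format("%-10s strict=%-5s shared=%s", s, surrounded?(s), surrounded?(s, SHARED))
end
```

The last input, +f+g+, is the one where the two patterns differ: the first rejects it, the lookahead version accepts it.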
So this is my code
convert = contents.gsub(/\\s1(.*?)(\n\\r.*?)?\n((?s)\\ms3(.*?)\\p)/, 'replacement code')
In the first bit, \\s1(.*?)(\n\\r.*?)?\n, I only want it to match a newline where I tell it there's one. But when searching for \\ms3(.*?)\\p, I want it to pick up any newlines that are there. Unfortunately, it looks like Ruby doesn't support the (?s) prefix. Is there any way of doing this?
thanks
Replace (.*?) with ([\s\S]*?).
You can use this instead of the DOTALL modifier.
convert = contents.gsub(/\\s1(.*?)(\n\\r.*?)?\n((\n*)\\ms3(.*?)\\p)/, 'replacement code')
This will capture any (zero or more) newlines before "\ms3". If that's not what you meant, please clarify what functionality you expect from (?s).
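For what it's worth, Ruby spells "dot matches newline" as the m flag, and it can be scoped to part of a pattern with (?m:...). The sketch below compares that with the [\s\S] workaround. The sample text and the constant names are invented for illustration, and the pattern is simplified from the one in the question (the optional \r part is left out).

```ruby
# Sample input, invented for this sketch: a heading line, then a block
# that spans two lines.
SAMPLE = "\\s1 Heading\n\\ms3 first line\nsecond line\\p"

# (?m:...) lets '.' match newlines only inside that group.
SCOPED = /\\s1(.*?)\n(?m:\\ms3(.*?)\\p)/
# Same idea without a flag: [\s\S] matches any character, newlines included.
CLASSY = /\\s1(.*?)\n\\ms3([\s\S]*?)\\p/

m = SAMPLE.match(SCOPED)
p m[1]                              # " Heading"
p m[2]                              # " first line\nsecond line"
p SAMPLE.match(CLASSY)[2] == m[2]   # true
```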
I want to extract #hashtags from a string, also those that have special characters such as #1+1.
Currently I'm using:
@hashtags ||= string.scan(/#\w+/)
But it doesn't work with those special characters. Also, I want it to be UTF-8 compatible.
How do I do this?
EDIT:
If the last character is a special character it should be removed, such as #hashtag, #hashtag. #hashtag! #hashtag? etc...
Also, the hash sign at the beginning should be removed.
The Solution
You probably want something like:
'#hash+tag'.encode('UTF-8').scan /\b(?<=#)[^#[:punct:]]+\b/
=> ["hash+tag"]
Note that the zero-width assertion at the beginning is required to avoid capturing the pound sign as part of the match.
References
String#encode
Ruby's POSIX Character Classes
This should work:
@hashtags = str.scan(/#([[:graph:]]*[[:alnum:]])/).flatten
Or if you don't want your hashtag to start with a special character:
@hashtags = str.scan(/#((?:[[:alnum:]][[:graph:]]*)?[[:alnum:]])/).flatten
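Here is a small sketch of the first pattern on a sample sentence, to show how the trailing punctuation and the leading # are dropped. The method name extract_hashtags and the sample text are made up for illustration; the accented character is written as \u00e9 so the example does not depend on the source file's encoding.

```ruby
# Hypothetical wrapper around the first pattern from this answer.
def extract_hashtags(str)
  # [[:graph:]]* grabs every visible character, then backtracks until the
  # match ends on an alphanumeric, which drops trailing punctuation.
  # The capture group leaves the '#' out of the result.
  str.scan(/#([[:graph:]]*[[:alnum:]])/).flatten
end

p extract_hashtags("Try #1+1, #hashtag. and #caf\u00e9!")
# => ["1+1", "hashtag", "café"]
```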
How about this:
@hashtags ||= string.match(/(#[[:alpha:]]+)|#[\d\+-]+\d+/).to_s[1..-1]
Takes care of #alphabets or #2323+2323 #2323-2323 #2323+65656-67676
Also removes the # at the beginning
Or if you want it in array form:
@hashtags ||= string.scan(/#[[:alpha:]]+|#[\d\+-]+\d+/).collect{|x| x[1..-1]}
Wow, this took so long, but I still don't understand why scan(/#[[:alpha:]]+|#[\d\+-]+\d+/) works but scan(/(#[[:alpha:]]+)|#[\d\+-]+\d+/) does not on my computer. The only difference is the () in the second scan statement. The parentheses make no difference, as expected, when I use the match method.
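A likely explanation, sketched below: when the pattern contains a capture group, String#scan returns the group captures for each match instead of the whole match, so a match that goes through the ungrouped alternative comes back as [nil]. The sample text and constant names are invented for illustration.

```ruby
TEXT = "#abc #12+34"

# No groups: scan returns each whole match.
PLAIN = TEXT.scan(/#[[:alpha:]]+|#[\d\+-]+\d+/)
# One capture group: scan returns an array of captures per match.
GROUPED = TEXT.scan(/(#[[:alpha:]]+)|#[\d\+-]+\d+/)
# A non-capturing group (?:...) restores the whole-match behaviour.
NONCAPTURING = TEXT.scan(/(?:#[[:alpha:]]+)|#[\d\+-]+\d+/)

p PLAIN         # => ["#abc", "#12+34"]
p GROUPED       # => [["#abc"], [nil]]
p NONCAPTURING  # => ["#abc", "#12+34"]
```

match does not show this difference because calling to_s on its result always gives the whole match, regardless of groups.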
The format I'm trying to match is:
# (Apple push notification codes)
"11a735e9 9f696c2f 700b2700 728042c6 137eeb7a 8442c27d 40e59d9e 3c7e0de7"
The simplest expression I can think of is: /((\w{8}\s){7}\w{8})/i
Can anyone think of a simpler one?
(I'm using Ruby regular expressions)
UPDATE - thanks to user1096188, I've removed \d - this is included in \w
You can detect a word boundary using \b, and use (?:...) to avoid creating capturing groups:
/(?:\w{8}\b\s?){8}/
You could do this if the end of the match is the end of the whole string.
(\w{8}(?:\s|$)){8}
Taking @zapthedingbat's solution one stage further, it looks like the code only contains hexadecimal characters (0-9 and a-f) and spaces. So you could possibly sacrifice a little simplicity for accuracy.
I'm making an assumption, but I suspect letters g to z are invalid.
If the format is hexadecimal only (you should check Apple's documentation to be sure), a tighter match would be:
/(?:[0-9a-f]{8}\b\s?){8}/
EDIT
In fact, in Ruby, it looks like you should be able to do:
/(?:\h{8}\b\s?){8}/
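A short sketch of the \h version in use. Two things here are my own assumptions rather than part of the answer: the \A and \z anchors, added so the whole string has to be a token, and the names TOKEN_RE and apns_token?. Whether tokens are really hex-only should still be checked against Apple's documentation.

```ruby
# \h matches one hexadecimal digit (0-9, a-f, A-F) in Ruby.
# \A and \z are added in this sketch so nothing may precede or follow the token.
TOKEN_RE = /\A(?:\h{8}\b\s?){8}\z/

# Hypothetical helper name.
def apns_token?(candidate)
  TOKEN_RE.match?(candidate)
end

good = "11a735e9 9f696c2f 700b2700 728042c6 137eeb7a 8442c27d 40e59d9e 3c7e0de7"
bad  = good.sub("11a735e9", "11a735eg")   # 'g' is not a hex digit

puts apns_token?(good)   # true
puts apns_token?(bad)    # false
```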
> "11a735e9 9f696c2f 700b2700 728042c6 137eeb7a 8442c27d 40e59d9e 3c7e0de7".match(/((\w{8}\s?)+)/)
> $&
=> "11a735e9 9f696c2f 700b2700 728042c6 137eeb7a 8442c27d 40e59d9e 3c7e0de7"