RegExp match fail in Ruby - ruby

I've got a problem with a chatbot in Ruby. There's a command for banning users, and it's supposed to work by writing this in the chat:
!ban [Username (the username may sometimes contain spaces)] [Length of the ban in seconds] [Reason]
like
!ban Chara Cipher 3600 making flood
and the code is like
match /^ban (.*)(^0-9) (.+)/, :method => :ban
# @param [User] user
# @param [String] target
# @param [Integer] length
# @param [String] reason
def ban(user, target, length, reason)
  if user.is? :mod
    @client.ban(target, length, reason)
    @client.send_msg "#{target} ha sido baneado gracias a la magia de la amistad."
  end
end
The problem is that the arguments aren't captured correctly for every input, probably because of the regular expression part, (.*)(^0-9) (.+).
Does anybody know how to fix it?
Update
https://gist.github.com/carlosqh2/b926e59772e3c28d104d756589acc75e#file-admin-rb-L213
Lines 214 and 255-263 of Admin.rb and line 188 of client.rb are the most relevant. Also, in lines 202-213 of Admin.rb the "!" is required for the commands to work in the chat.

Three issues I see. First, you're matching 'ban', not '!ban'. Second, the first group, (.*), is greedy and will match the entire rest of the string, including the ban length and the reason. Third, the pattern for the second group is wrong. I suggest explicitly matching the spaces that delimit the arguments, like ^!ban\s(.+)\s(\d+)\s(.+).
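As a rough sketch of that suggestion outside the bot framework, the pattern can be tried on a plain string. The constant BAN_PATTERN and the helper parse_ban are made-up names, and how the framework hands the captures to ban is an assumption. This variant uses a lazy (.+?) for the username so that digits inside the reason are less likely to be taken for the length; a username that itself contains a standalone number would still be ambiguous.

```ruby
# Hypothetical names; not part of the original bot code.
# \A and \z anchor the pattern to the whole string.
BAN_PATTERN = /\A!ban\s+(.+?)\s+(\d+)\s+(.+)\z/

# Returns a hash with the three arguments, or nil if the line is not a ban command.
def parse_ban(line)
  m = BAN_PATTERN.match(line)
  return nil unless m

  { target: m[1], length: m[2].to_i, reason: m[3] }
end

p parse_ban("!ban Chara Cipher 3600 making flood") # target "Chara Cipher", length 3600
p parse_ban("!ban Bob 60 spam 3 times")            # target "Bob", length 60
p parse_ban("!ban nobody")                         # nil: no length or reason given
```

Captures are always strings, so the length is converted with to_i here.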

I don't think (^0-9) does what you think it does. In a regex it means "capture the literal characters '0-9' at the start of a line".
Meditate on this:
" 0-9"[/(^0-9)/] # => nil
"0-9"[/(^0-9)/] # => "0-9"
" \n0-9"[/(^0-9)/] # => "0-9"
In the last one, ^ matched at the start of the second line, right after the newline, so only "0-9" was returned.
Instead you probably want [^0-9] which means "a character that is not 0-9" and will match correctly in the middle of strings:
" 0-9"[/[^0-9]/] # => " "
"0-9"[/[^0-9]/] # => "-"
" \n0-9"[/[^0-9]/] # => " "
Read the Regexp documentation and you can piece this all together.
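Applied to the arguments of the ban command from the question, the difference between the three forms can be sketched like this (the variable names are only for illustration). For picking out the ban length, a plain digit class such as [0-9]+ or \d+ is probably what was intended.

```ruby
line = "Chara Cipher 3600 making flood"

anchored_literal = line[/(^0-9)/]   # literal "0-9" at the start of a line
non_digits       = line[/[^0-9]+/]  # first run of characters that are not digits
digits           = line[/[0-9]+/]   # first run of digits (same as /\d+/)

p anchored_literal  # => nil
p non_digits        # => "Chara Cipher "
p digits            # => "3600"
```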

Related

Remove Certain Alphanumeric Characters from a String in Ruby

I have to validate a string based on the first alphanumeric character of the string. Certain characters can be part of the string, but if they are at the beginning then they have to be ignored.
For example:
--- BATest- 1 --
should be:
BATest-1
How do I remove dashes from beginning and end but not from middle?
To add to my question: can the first alphanumeric character decide if following alphanumeric characters are to be removed or not?
I.e. If A then nothing would need to be removed and throw a validation error; and yet if B then strip the string as mentioned above.
r = /
  --+   # Match at least two hyphens
  |     # or
  \s    # Match a whitespace character
/x      # Free-spacing regex definition mode
'--- BATest- 1 --'.gsub r, ""
#=> "BATest-1"
You asked to remove the dashes from the beginning and the end:
"--- BATest- 1 --".gsub(/^-+|-+$|\s/, "")
# => "BATest-1"
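Neither answer addresses the follow-up about letting the first alphanumeric character decide what happens. A minimal sketch of one reading of that rule is below: strip_edges and normalize are hypothetical names, and treating "A" as a validation error and "B" as "strip" is only an interpretation of the question. It uses \A and \z so that only hyphens at the very start and end of the string are removed, even if the input spans several lines.

```ruby
# Remove hyphens at the very start and end of the string, plus all whitespace.
def strip_edges(str)
  str.gsub(/\A-+|-+\z|\s/, "")
end

# Hypothetical rule, as read from the question:
#   first alphanumeric character "A" -> validation error, nothing removed
#   first alphanumeric character "B" -> strip as above
#   anything else                    -> returned unchanged
def normalize(str)
  case str[/[[:alnum:]]/]
  when "A" then raise ArgumentError, "invalid value: #{str.inspect}"
  when "B" then strip_edges(str)
  else str
  end
end

p strip_edges("--- BA--Test- 1 --")  # => "BA--Test-1" (inner hyphens are kept)
p normalize("--- BATest- 1 --")      # => "BATest-1"
```

Unlike the --+ alternative above, this leaves a run of hyphens in the middle of the string alone.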

Need to extract substrings based on key words

I have a string (a block of cdata from a soap) that looks roughly like:
"<![CDATA[XXX|^~\&
KEY|^~\&|xxxxx|xxxxx^xxxx xxxxx
INFO||xxx|xxxxxx||xxxxx|xxxxxxx|xxxxxxx
INFO|||xxxxx||||xxxxxxxxx||||||||||xxxxxxxx
KEY|^~\&|xxxxxx|xxxxxxxxxx|xxxxxxxx
INFO||xx|xxxxxxxx||xxxxxxx|xxxxxx
INFO|||xxxx|x|||xxxxxxxxx|||||||x|||xxxxx|||xxxx||||||||||||||||||||||||xxxx
KEY|^~\&|xxxxx|xxxxx^xxxx xxxxx
INFO||xxx|xxxxxx||xxxxx|xxxxxxx|xxxxxxx
INFO|||xxxxx||||xxxxxxxxx||||||||||xxxxxxxx ]]>"
I am trying to figure out how to safely parse out a string for each 'KEY' section using Ruby. Basically I need a string that looks like:
"KEY|^~\&|xxxxx|xxxxx^xxxx xxxxx
INFO||xxx|xxxxxx||xxxxx|xxxxxxx|xxxxxxx
INFO|||xxxxx||||xxxxxxxxx||||||||||xxxxxxxx"
There should be one such string for each 'KEY'. Thoughts on the best way to go about this? Thanks.
Here's one way to do it (with a simplified example):
str =
"<![CDATA[XXX|^~\&
KEY|^~\&|x
INFO||x
INFO|||x
KEY|^~\&|x
INFO||xx|x
INFO|||x
KEY|^~\&|x
INFO||x
INFO|||x"
r = /
  ^KEY\b          # match KEY at beginning of line followed by word boundary
  .+?             # match one or more of any character, lazily
  (?=\bKEY\b|\z)  # match KEY bracketed by word boundaries or end of
                  # string, in positive lookahead
/mx               # multiline and extended modes
str.scan r
#=> ["KEY|^~&|x\nINFO||x\nINFO|||x\n",
# "KEY|^~&|x\nINFO||xx|x\nINFO|||x\n",
# "KEY|^~&|x\nINFO||x\nINFO|||x"]
Not as relaxed a regex as I'd like, but this might work for you:
KEY(.+\n)+(?=\s+KEY)
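An alternative that avoids lookaheads is to split the text into lines and start a new group whenever a line begins with KEY. The following is only a sketch on made-up sample data: the variable names are illustrative, and it assumes that every section starts with a line beginning with "KEY|" and that the closing ]]> can simply be cut off.

```ruby
# Simplified sample data; a single-quoted heredoc keeps the backslashes literal.
raw = <<~'CDATA'
  <![CDATA[XXX|^~\&
  KEY|^~\&|a
  INFO||a1
  INFO|||a2
  KEY|^~\&|b
  INFO||b1 ]]>
CDATA

sections = raw
  .sub(/\s*\]\]>\s*\z/, "")                            # drop the closing ]]>
  .lines(chomp: true)                                  # one element per line
  .slice_before { |line| line.start_with?("KEY|") }    # new chunk at each KEY line
  .select { |chunk| chunk.first.start_with?("KEY|") }  # skip the CDATA header chunk
  .map { |chunk| chunk.join("\n") }

sections.each { |section| puts section, "--" }
```

Each element of sections is then one KEY line followed by its INFO lines, joined with newlines.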

Regex matching chars around text

I have a string with chars inside and I would like to match only the chars around a string.
"This is a [1]test[/1] string. And [2]test[/2]"
Rubular http://rubular.com/r/f2Xwe3zPzo
Currently, the code in the link matches the text inside the special chars, how can I change it?
Update
To clarify my question: it should only match if the opening and closing tags have the same number.
"[2]first[/2] [1]second[/2]"
In the code above, only first should match and not second. The text inside the special chars (first), should be ignored.
Try this:
(\[[0-9]\]).+?(\[\/[0-9]\])
Permalink to the example on Rubular.
Update
Since you want to remove the 'special' characters, try this instead:
foo = "This is a [1]test[/1] string. And [2]test[/2]"
foo.gsub /\[\/?\d\]/, ""
# => "This is a test string. And test"
Update, Part II
You only want to remove the 'special' characters when the surrounding tags match, so what about this:
foo = "This is a [1]test[/1] string. And [2]test[/2], but not [3]test[/2]"
foo.gsub /(?:\[(?<number>\d)\])(?<content>.+?)(?:\[\/\k<number>\])/, '\k<content>'
# => "This is a test string. And test, but not [3]test[/2]"
\[([0-9])\].+?\[\/\1\]
([0-9]) is a capture since it is surrounded with parentheses. The \1 tells it to use the result of that capture. If you had more than one capture, you could reference them as well, \2, \3, etc.
Rubular
You can also use a named capture, rather than \1 to make it a little less cryptic. As in: \[(?<number>[0-9])\].+?\[\/\k<number>\]
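To see the backreference at work on the example from the question's update, here is a small sketch using the named form. TAGGED is just an illustrative constant name, and \d+ is used instead of [0-9] so that multi-digit tags are also handled.

```ruby
# The closing tag must repeat whatever number the opening tag captured.
TAGGED = /\[(?<number>\d+)\](?<content>.+?)\[\/\k<number>\]/

str = "[2]first[/2] [1]second[/2]"

pairs    = str.scan(TAGGED)                  # captures of every matching pair
stripped = str.gsub(TAGGED, '\k<content>')   # keep only the inner text

p pairs     # => [["2", "first"]]
p stripped  # => "first [1]second[/2]"
```

The mismatched pair [1]second[/2] is left untouched because \k<number> would require [/1].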
Here's a way to do it that uses the form of String#gsub that takes a block. The idea is to pull strings such as "[1]test[/1]" into the block, and there remove the unwanted bits.
str = "This is a [1]test[/1] string. And [2]test[/2], plus [3]test[/99]"
r = /
\[ # match a left bracket
(\d+) # capture one or more digits in capture group 1
\] # match a right bracket
.+? # match one or more characters lazily
\[\/ # match a left bracket and forward slash
\1 # match the contents of capture group 1
\] # match a right bracket
/x
str.gsub(r) { |s| s[/(?<=\]).*?(?=\[)/] }
#=> "This is a test string. And test, plus [3]test[/99]"
Aside: When I first heard of named capture groups, they seemed like a great idea, but now I wonder if they really make regexes easier to read than \1, \2....

removing all spaces within a specific string (email address) using ruby

The user is able to input text, but the way I ingest the data it often contains unnecessary carriage returns and spaces.
To remove those to make the input look more like a real sentence, I use the following:
string.delete!("\n")
string = string.squeeze(" ").gsub(/([.?!]) */,'\1 ')
But in the case of the following, I get an unintended space in the email:
string = "Hey what is \n\n\n up joeblow@dude.com \n okay"
I get the following:
"Hey what is up joeblow@dude. com okay"
How can I enable an exception for the email part of the string so I get the following:
"Hey what is up joeblow@dude.com okay"
Edited
Your method does the following:
string.squeeze(" ") # replaces each sequence of " " with a single space
gsub(/([.?!]) */, '\1 ') # matches each ., ? or ! together with any spaces
# after it (zero or more) and replaces the match with
# the punctuation character plus one space; this is
# why the email address gets split
I guess what you really want by this is, if there is no space after punctuation marks, add one space. You can do this instead.
string.gsub(/([.?!])\W/, '\1 ') # if there is a non word char after
# those punctuation chars, just add a space
Then you just need to replace every sequence of whitespace characters with one space, so the final solution will be:
string.gsub(/([.?!])(?=\W)/, '\1 ').gsub(/\s+/, ' ')
# ([.?!]) => matches ., ? or ! and captures it
# (?=\W) => a lookahead: requires the next character to be a non-word
# character, but does not consume it
# so /([.?!])(?=\W)/ finds punctuation marks that are followed by a
# non-word character (a space, a newline, or even more punctuation,
# for example)
# '\1 ' => \1 stands for the captured group (the text matched by
# ([.?!]), a single character here), so a space is added after it
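Wrapped in a small helper (tidy is a made-up name), the two-step substitution can be checked against the input from the question. This is only a sketch; it assumes the address is written with an @ and that no other cleanup is wanted.

```ruby
# Add a space after ., ? or ! when a non-word character follows,
# then collapse every run of whitespace into a single space.
def tidy(text)
  text.gsub(/([.?!])(?=\W)/, '\1 ').gsub(/\s+/, " ")
end

p tidy("Hey what is \n\n\n up joeblow@dude.com \n okay")
# => "Hey what is up joeblow@dude.com okay"
p tidy("Wait!\nReally?")
# => "Wait! Really?"
```

The dot in the address is followed by a word character, so the lookahead fails there and no space is inserted.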
If you are okay with getting rid of the squeeze statement, then using Nafaa's answer is the simplest way to do it, but I've listed an alternate method in case it's helpful:
string = string.split(" ").join(" ")
However, if you want to keep that squeeze statement you can amend Nafaa's method and use it after the squeeze statement:
string.gsub(/\s+/, ' ').gsub('. com', '.com')
or just directly change the string:
string.gsub('. com', '.com')
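One practical difference between the split/join approach and gsub(/\s+/, ' ') is how leading and trailing whitespace is treated. A short sketch (the sample text and variable names are only for illustration):

```ruby
text = "  Hey what is \n\n\n up joeblow@dude.com \n okay  "

# split(" ") splits on any run of whitespace and ignores leading whitespace.
via_split = text.split(" ").join(" ")
# gsub only collapses runs, so one space survives at each end.
via_gsub  = text.gsub(/\s+/, " ")

p via_split  # => "Hey what is up joeblow@dude.com okay"
p via_gsub   # => " Hey what is up joeblow@dude.com okay "
```

If the surrounding spaces are unwanted with the gsub version, a trailing .strip removes them.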

Ruby regular expression

Apparently I still don't understand exactly how it works ...
Here is my problem: I'm trying to match numbers in strings such as:
910 -6.258000 6.290
That string should give me an array like this:
[910, -6.2580000, 6.290]
while the string
blabla9999 some more text 1.1
should not be matched.
The regex I'm trying to use is
/([-]?\d+[.]?\d+)/
but it doesn't do exactly that. Could someone help me?
It would be great if the answer could clarify the use of the parenthesis in the matching.
Here's a pattern that works:
/^[^\d]+?\d+[^\d]+?\d+[\.]?\d+$/
Note that [^\d]+ means at least one non digit character.
On second thought, here's a more generic solution that doesn't need one big pattern for the whole line:
str.gsub(/[^\d.-]+/, " ").split.collect{|d| d.to_f}
Example:
str = "blabla9999 some more text -1.1"
Parsed:
[9999.0, -1.1]
The brackets and parentheses have different meanings.
[] defines a character class: exactly one character is matched, and it must be a member of the class.
() defines a capturing group: the text matched by the part inside the parentheses is stored so you can refer to it later.
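The effect of capturing groups is easy to see with String#scan: without groups it returns the matched strings, with groups it returns one array of captures per match. A small sketch (variable names are only illustrative):

```ruby
s = "910 -6.258000 6.290"

# (?: ... ) groups without capturing, so scan returns whole matches.
no_groups   = s.scan(/-?\d+(?:\.\d+)?/)
# Two capturing groups: integer part and optional fractional part.
with_groups = s.scan(/(-?\d+)(\.\d+)?/)

p no_groups    # => ["910", "-6.258000", "6.290"]
p with_groups  # => [["910", nil], ["-6", ".258000"], ["6", ".290"]]
```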
You did not define any anchors so your pattern will match your second string
blabla9999 some more text 1.1
      ^^^^ here           ^^^ and here
Maybe this is more what you wanted
^(\s*-?\d+(?:\.\d+)?\s*)+$
See it here on Regexr
^ anchors the pattern to the start of the string and $ to the end.
It allows whitespace (\s) before and after each number and an optional fractional part, (?:\.\d+)?. The whole group must match at least once.
Maybe /(-?\d+(\.\d+)?)+/
irb(main):010:0> "910 -6.258000 6.290".scan(/(\-?\d+(\.\d+)?)+/).map{|x| x[0]}
=> ["910", "-6.258000", "6.290"]
str = " 910 -6.258000 6.290"
# (?:\.\d+)? makes the fractional part optional, so single-digit integers match too
str.scan(/-?\d+(?:\.\d+)?/).map(&:to_f)
# => [910.0, -6.258, 6.29]
If you don't want integers to be converted to floats, try this:
str = " 910 -6.258000 6.290"
str.scan(/-?\d+(?:\.\d+)?/).map do |ns|
  ns[/\./] ? ns.to_f : ns.to_i
end
# => [910, -6.258, 6.29]
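The scan-based answers pull numbers out of any string, including the one that should not match. One way to combine them with the anchored pattern is to validate the whole line first and only then extract. This is a sketch under that reading of the question; NUMBER, NUMBERS_ONLY and parse_numbers are made-up names.

```ruby
# A single number: optional minus sign, digits, optional fractional part.
NUMBER = /-?\d+(?:\.\d+)?/
# A whole line consisting only of whitespace-separated numbers.
NUMBERS_ONLY = /\A\s*#{NUMBER}(?:\s+#{NUMBER})*\s*\z/

# Returns the numbers in the line, or nil if the line contains anything else.
def parse_numbers(line)
  return nil unless line.match?(NUMBERS_ONLY)

  line.scan(NUMBER).map { |token| token.include?(".") ? token.to_f : token.to_i }
end

p parse_numbers("910 -6.258000 6.290")            # => [910, -6.258, 6.29]
p parse_numbers("blabla9999 some more text 1.1")  # => nil
```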

Resources