Regex to detect period at end of string, but not '...' - ruby

Using a regex, how can I match strings that end with exactly one . as:
This is a string.
but not those that end with more than one . as:
This is a string...
I have a regex that detects a single .:
/[\.]{1}\z/
but I do not want it to match strings that end in ....

What you want is a 'negative lookbehind' assertion:
(?<!\.)\.\z
This looks for a period at the end of a string that isn't preceded by a period. The other answers won't match the following string: "."
Also, you may need to look out for unicode ellipsis characters…
You can detect this like so: str =~ /\u{2026}/
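As a quick check of the lookbehind approach, here is a minimal sketch. The helper name `ends_with_single_period?` is made up for this illustration and is not part of the answer above.

```ruby
# A final period that is not itself preceded by a period.
SINGLE_FINAL_PERIOD = /(?<!\.)\.\z/

# Hypothetical helper name, introduced only for this illustration.
def ends_with_single_period?(str)
  str.match?(SINGLE_FINAL_PERIOD)
end

puts ends_with_single_period?("This is a string.")   # true
puts ends_with_single_period?("This is a string...") # false
puts ends_with_single_period?(".")                   # true
puts ends_with_single_period?("Wait\u2026")          # false (ends with the ellipsis character, not a period)
```

`String#match?` needs Ruby 2.4 or later; on older versions `!!(str =~ SINGLE_FINAL_PERIOD)` should behave the same way.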

You can use:
[^\.][\.]\z
This matches a string whose final dot is preceded by a character that is not a dot.

I like Regexr a lot!
Solution similar to Dekel:
[^.]+[.]
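To see how the three patterns suggested in this thread differ, the following sketch runs each of them against the same inputs. The hash keys are just labels chosen for this example.

```ruby
patterns = {
  "lookbehind"      => /(?<!\.)\.\z/,
  "character class" => /[^\.][\.]\z/,
  "unanchored"      => /[^.]+[.]/
}
inputs = ["This is a string.", "This is a string...", "."]

# For every pattern, record whether each input matches.
results = patterns.transform_values do |re|
  inputs.map { |s| s.match?(re) }
end

results.each { |name, flags| puts "#{name}: #{flags.inspect}" }
# lookbehind: [true, false, true]
# character class: [true, false, false]
# unanchored: [true, true, false]
```

Only the lookbehind version accepts the lone "." string, and the unanchored pattern also matches the "..." string because nothing ties it to the end of the input.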

Related

Ruby regex: operator and

I have a string containing an email address that looks like "<luke#example.com>"
I would like to use regex for deleting "<" and ">", so I wanted something like
"<luke#example.com>".sub /<>/, ""
The problem is quite clear: /<>/ doesn't match what I want. I tried different regexes, but I don't know how to select both < AND >. Is there an "and" operator that lets me say "match this and this"?
As written, your regex matches the literal substring "<>" only. You need to use [] to make them a character class so that they're matched individually, and gsub to replace all matches:
"<luke#example.com>".gsub(/[<>]/, "") # => "luke#example.com"
"<luke#example.com>".gsub /[<>]/, ""
http://regex101.com/r/hP3sY2
If you only want to strip the < and > from the start and end, you can use this:
'<luke#example.com>'.sub(/\A<([^<>]+)>\z/, '\1')
You don't need, nor should you use, a regex.
string[1..-2]
is enough.
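The suggestions above can be compared side by side. This is only a sketch; `String#delete` is added here as one more regex-free option and was not mentioned in the answers.

```ruby
email = "<luke#example.com>"

with_class   = email.gsub(/[<>]/, "")            # removes every < and >
with_anchors = email.sub(/\A<([^<>]+)>\z/, '\1') # removes only a wrapping pair
with_slice   = email[1..-2]                      # drops the first and last character, whatever they are
with_delete  = email.delete("<>")                # String#delete, no regex involved

puts [with_class, with_anchors, with_slice, with_delete].uniq.inspect
# ["luke#example.com"]

# The approaches differ when the input is not wrapped in brackets:
plain = "luke#example.com"
puts plain[1..-2]                      # uke#example.co
puts plain.sub(/\A<([^<>]+)>\z/, '\1') # luke#example.com
```

Slicing is the shortest option but silently eats characters when the brackets are missing, so it is only safe if the input format is guaranteed.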

String gsub - Replace characters between two elements, but leave surrounding elements

Suppose I have the following string:
mystring = "start/abc123/end"
How can I replace the abc123 with something else, while leaving the "start/" and "/end" elements intact?
I had the following to match the pattern, but it replaces the entire string. I was hoping to have it replace just the abc123 with 123abc.
mystring.gsub(/start\/(.*)\/end/,"123abc") #=> "123abc"
Edit: The characters between the start & end elements can be any combination of alphanumeric characters, I changed my example to reflect this.
You can do it using this character class : [^\/] (all that is not a slash) and lookarounds
mystring.gsub(/(?<=start\/)[^\/]+(?=\/end)/,"7")
For your example, you could perhaps use:
mystring.gsub(/\/(.*?)\//,"/7/")
This matches the two slashes around the string you're replacing and puts them back in the substitution.
Alternatively, you could capture the pieces of the string you want to keep and interpolate them around your replacement. This turns out to be much more readable than lookaheads/lookbehinds:
irb(main):010:0> mystring.gsub(/(start)\/.*\/(end)/, "\\1/7/\\2")
=> "start/7/end"
\\1 and \\2 here refer to the numbered captures inside of your regular expression.
The problem is that you're replacing the entire matched string, "start/8/end", with "7". You need to include the matched characters you want to persist:
mystring.gsub(/start\/(.*)\/end/, "start/7/end")
Alternatively, just match the digits:
mystring.gsub(/\d+/, "7")
You can do this by grouping the start and end elements in the regular expression and then referring to these groups in the substitution string. Named groups are referenced with \k<name>:
mystring.gsub(/(?<start>start\/).*(?<end>\/end)/, "\\k<start>7\\k<end>")
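Here is a small sketch that puts three of these approaches next to each other, using the 123abc replacement from the question. The group names `head` and `tail` are arbitrary. The block form is used for the numbered captures because the replacement begins with a digit, which would be awkward to read right after a `\1` backreference.

```ruby
mystring = "start/abc123/end"

# Lookarounds: only the middle segment is part of the match.
a = mystring.gsub(/(?<=start\/)[^\/]+(?=\/end)/, "123abc")

# Numbered captures, re-inserted through the block form.
b = mystring.gsub(/(start\/)[^\/]+(\/end)/) { "#{$1}123abc#{$2}" }

# Named captures, referenced with \k<name> in a single-quoted replacement.
c = mystring.gsub(/(?<head>start\/)[^\/]+(?<tail>\/end)/, '\k<head>123abc\k<tail>')

puts [a, b, c].inspect
# ["start/123abc/end", "start/123abc/end", "start/123abc/end"]
```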

How to convert a Ruby classname ("NewUserBatch") to string with underscores ("new_user_batch")

I need a generic way to convert classnames to lowercase with underscores. For example, I wish to convert the classname NewUserBatch to new_user_batch. How to do this?
Underscore.
>> 'NewUserBatch'.underscore
=> "new_user_batch"
It is included in Rails; if you don't use Rails, you can refer to its source code:
def underscore(camel_cased_word)
  word = camel_cased_word.to_s.dup
  word.gsub!(/::/, '/')
  word.gsub!(/(?:([A-Za-z\d])|^)(#{inflections.acronym_regex})(?=\b|[^a-z])/) { "#{$1}#{$1 && '_'}#{$2.downcase}" }
  word.gsub!(/([A-Z\d]+)([A-Z][a-z])/,'\1_\2')
  word.gsub!(/([a-z\d])([A-Z])/,'\1_\2')
  word.tr!("-", "_")
  word.downcase!
  word
end
In the simple case, where you only have non-namespaced class names, you can use this oneliner:
EDIT: updated with positive look-ahead assertion (thanks #vladr)
"MYRubyClassName".gsub(/(.)([A-Z](?=[a-z]))/,'\1_\2').downcase
# => "my_ruby_class_name"
This finds all uppercase chars that follow another char and are in turn followed by a lowercase char, inserts an underscore before each of them, and then downcases everything.
Also a nice tip: if you know the INPUT (NewUserBatch) and the OUTPUT (new_user_batch) but not the method that connects them, use the following method:
"NewUserBatch".find_method("new_user_batch")
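If Rails is not available, a reduced version of the same idea can be written in plain Ruby. The sketch below is a simplification and not the Rails implementation: the name `simple_underscore` is made up, and the acronym handling is left out on the assumption that no custom inflection rules are needed.

```ruby
# Hypothetical helper: a simplified stand-in for ActiveSupport's underscore.
# Assumption: no acronym inflection rules are required.
def simple_underscore(class_name)
  class_name.to_s
            .gsub("::", "/")                         # namespaces become path separators
            .gsub(/([A-Z\d]+)([A-Z][a-z])/, '\1_\2') # "MYRuby"  -> "MY_Ruby"
            .gsub(/([a-z\d])([A-Z])/, '\1_\2')       # "NewUser" -> "New_User"
            .tr("-", "_")
            .downcase
end

puts simple_underscore("NewUserBatch")     # new_user_batch
puts simple_underscore("MYRubyClassName")  # my_ruby_class_name
puts simple_underscore("Admin::UserBatch") # admin/user_batch
```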

Separate word Regex Ruby

I am looping over a bunch of input files and extracting tags from them. However, I want to separate some of the words. The incoming strings are in the form cs###, where each # is any digit from 0-9. I want the result to be cs ###. The closest answer I found was "Regex to separate Numeric from Alpha", but I cannot get it to work, because the string there is predefined (static) and mine changes.
Found answer:
Never mind, I found the answer. The following separates alphabetic and numeric characters and removes any unwanted non-alphanumeric characters, so anything like ab5#6$% => ab 56
gsub(/(?<=[0-9])(?=[a-z])|(?<=[a-z])(?=[0-9])/i, ' ').gsub(/[^0-9a-z ]/i, '')
If your string is something like
str = "cs3232
cs23
cs423"
Then you can do something like
str.scan(/((cs)(\d{1,10}))/m).collect{|e| e.shift; e }
# [["cs", "3232"], ["cs", "23"], ["cs", "423"]]
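For the cs### case specifically, a zero-width match at each letter/digit boundary is enough. The sketch below uses a hypothetical helper name, `split_alpha_numeric`, and assumes the tags contain only ASCII letters and digits.

```ruby
# Hypothetical helper: insert a space at each letter/digit boundary.
# Assumption: tags contain only ASCII letters and digits.
def split_alpha_numeric(tag)
  tag.gsub(/(?<=[a-z])(?=\d)|(?<=\d)(?=[a-z])/i, " ")
end

puts split_alpha_numeric("cs101")  # cs 101
puts split_alpha_numeric("cs3232") # cs 3232

tags = "cs3232\ncs23\ncs423".lines.map { |line| split_alpha_numeric(line.chomp) }
puts tags.inspect # ["cs 3232", "cs 23", "cs 423"]
```

Because the pattern matches positions rather than characters, nothing is removed from the input; only the space is added.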

Ruby regular expression

Apparently I still don't understand exactly how it works ...
Here is my problem: I'm trying to match numbers in strings such as:
910 -6.258000 6.290
That string should give me an array like this:
[910, -6.258000, 6.290]
while the string
blabla9999 some more text 1.1
should not be matched.
The regex I'm trying to use is
/([-]?\d+[.]?\d+)/
but it doesn't do exactly that. Could someone help me?
It would be great if the answer could clarify the use of parentheses in the matching.
Here's a pattern that works:
/^[^\d]+?\d+[^\d]+?\d+[\.]?\d+$/
Note that [^\d]+ means at least one non digit character.
On second thought, here's a more generic solution that doesn't need a complicated regular expression:
str.gsub(/[^\d.-]+/, " ").split.collect{|d| d.to_f}
Example:
str = "blabla9999 some more text -1.1"
Parsed:
[9999.0, -1.1]
The brackets have different meanings.
[] defines a character class: it matches one character that is a member of the class.
() defines a capturing group: the string matched by the part in parentheses is stored in a variable.
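The difference shows up clearly with `String#scan`, which returns whole matches when the pattern has no capturing groups and returns the captures when it has some. A short sketch on the string from the question:

```ruby
line = "910 -6.258000 6.290"

# Character class: [-+] matches ONE character, either - or +.
puts line.scan(/[-+]/).inspect              # ["-"]

# Capturing group: scan returns only the captured part of each match.
puts line.scan(/(-?\d+)\.\d+/).inspect      # [["-6"], ["6"]]

# Non-capturing group (?:...): groups without capturing, so scan returns whole matches.
puts line.scan(/-?\d+(?:\.\d+)?/).inspect   # ["910", "-6.258000", "6.290"]
```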
You did not define any anchors so your pattern will match your second string
blabla9999 some more text 1.1
^^^^ here ^^^ and here
Maybe this is more what you wanted
^(\s*-?\d+(?:\.\d+)?\s*)+$
^ anchors the pattern to the start of a line and $ to the end of a line (in Ruby, use \A and \z to anchor to the start and end of the whole string).
It allows whitespace (\s) before and after each number and an optional fractional part, (?:\.\d+)?. The whole group must match at least once.
Maybe /(-?\d+(\.\d+)?)+/
irb(main):010:0> "910 -6.258000 6.290".scan(/(\-?\d+(\.\d+)?)+/).map{|x| x[0]}
=> ["910", "-6.258000", "6.290"]
str = " 910 -6.258000 6.290"
str.scan(/-?\d+\.?\d+/).map(&:to_f)
# => [910.0, -6.258, 6.29]
If you don't want integers to be converted to floats, try this:
str = " 910 -6.258000 6.290"
str.scan(/-?\d+\.?\d+/).map do |ns|
ns[/\./] ? ns.to_f : ns.to_i
end
# => [910, -6.258, 6.29]
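Note that `\d+\.?\d+` requires at least two digits, so a single-digit number such as 7 would be skipped, and `scan` on its own would still pick numbers out of the "blabla9999" line. One possible way to cover both points is to validate the whole line first and only then extract. This is a sketch; the constant names and the `parse_numbers` helper are made up for the example.

```ruby
NUMBER = /-?\d+(?:\.\d+)?/
# A line made up of nothing but whitespace-separated numbers.
NUMBERS_ONLY_LINE = /\A\s*#{NUMBER}(?:\s+#{NUMBER})*\s*\z/

# Hypothetical helper: returns an array of numbers, or nil when the line
# contains anything other than whitespace-separated numbers.
def parse_numbers(line)
  return nil unless line.match?(NUMBERS_ONLY_LINE)
  line.scan(NUMBER).map { |s| s.include?(".") ? s.to_f : s.to_i }
end

p parse_numbers("910 -6.258000 6.290")           # [910, -6.258, 6.29]
p parse_numbers("blabla9999 some more text 1.1") # nil
p parse_numbers("7 8.5")                         # [7, 8.5]
```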

Resources