How can one write this gsub regex match? [duplicate] - ruby

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
Perfect way to write a gsub for a regex match?
I am trying to write a gsub for a regex match, but I imagine there's a better way to do this.
My expression:
ref.gsub(ref.match(/settings(.*)/)[1], '')
This lets me take settings/animals and return just settings.
But what if settings is missing? Then match returns nil and my [1] fails, as expected.
So how can one write the above statement, assuming that sometimes settings won't match?

Use /(settings|)(.*)/; the first group will then return "settings", or an empty string if the input does not start with it.
puts 'settings/123'.match(/(settings|)(.*)/)[1]
puts 'Xettings/123'.match(/(settings|)(.*)/)[1]
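A nil-safe variant of the original one-liner, as a minimal sketch: sub with a capture group leaves the string untouched when the pattern does not match, so there is never a [1] called on nil. The helper name strip_after_settings is made up for illustration and is not from the question.

```ruby
# Hypothetical helper: drop everything that follows "settings".
# String#sub returns the string unchanged when the pattern does not match,
# so no nil check is needed.
def strip_after_settings(ref)
  ref.sub(/(settings).*/, '\1')
end

puts strip_after_settings('settings/animals')  # prints "settings"
puts strip_after_settings('users/animals')     # prints "users/animals"
```

Since only one replacement is wanted, sub is enough here; gsub would behave the same on these inputs.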

Related

Ruby: how to perform lazy regex matching? [duplicate]

This question already has answers here:
Capturing groups don't work as expected with Ruby scan method
(3 answers)
Closed 5 years ago.
This is a follow-up question to Lazy (ungreedy) matching multiple groups using regex. I tried to use the method from there, without much success.
I grab a string from the GitLab API and try to extract all the repos. Each repo name follows the format "https://gitlab.example.com/foo/xxx.git".
So far, if I try this, it works OK:
gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\//)
Adding a wildcard for the name is the tricky part. I used the method from the previous question:
gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/(.*?)\.git\"/)
It says to use (.*?) for lazy matching, but it doesn't seem to work.
Thanks a lot for the help.
If we have the following string:
gitlab_str = "\"https://gitlab.example.com/foo/xxx.git\""
The following RegEx will return [["xxx"]], which is expected:
gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/(.*?)\.git\"/)
That is because of the (.*?): note the parentheses. When the pattern contains a capture group, scan returns only what the group captured.
If you want to return the whole matched string, just remove the parentheses:
gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/.*?\.git\"/)
This will return:
["\"https://gitlab.example.com/foo/xxx.git\""]
It also works for multiple occurrences:
> gitlab_str = "\"https://gitlab.example.com/foo/xxx.git\" and \"https://gitlab.example.com/foo/yyy.git\""
> gitlab_str.scan(/\"https\:\/\/gitlab\.example\.com\/foo\/.*?\.git\"/)
=> ["\"https://gitlab.example.com/foo/xxx.git\"", "\"https://gitlab.example.com/foo/yyy.git\""]
Finally, if you want to remove the https:// part from the resulting matches, then just wrap everything but that part with () in the RegEx:
gitlab_str.scan(/\"https\:\/\/(gitlab\.example\.com\/foo\/.*?\.git)\"/)
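To pull out just the repo names as a flat list, something along these lines should work. This is a sketch only: the constant REPO_URL and the helper repo_names are illustrative names, and %r{...} is used so the slashes need no escaping.

```ruby
# Illustrative pattern: the lazy (.*?) captures the repo name between
# "foo/" and ".git". With a capture group, scan returns an array of arrays.
REPO_URL = %r{"https://gitlab\.example\.com/foo/(.*?)\.git"}

# Hypothetical helper: flatten the per-match arrays into a plain list of names.
def repo_names(text)
  text.scan(REPO_URL).flatten
end

sample = '"https://gitlab.example.com/foo/xxx.git" and "https://gitlab.example.com/foo/yyy.git"'
p repo_names(sample)  # prints ["xxx", "yyy"]
```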

Understanding Ruby match method of Regexp class [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 6 years ago.
I was reading about the match method in Ruby, and I understood most of the examples given in the Regexp documentation.
But I fail to understand why the result is:
/[0-9a-f]/.match('9f')
=> #<MatchData "9">
And not:
=> #<MatchData "9f">
I might be missing some basic understanding of Regex, so bear with me.
Because you're asking it to match a single character in the class 0-9 or a-f.
To match several characters, put a plus or an asterisk after the character class, e.g. /[0-9a-f]+/.match('9f')
It's all here.
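A small side-by-side sketch of the difference. The helper first_hex_run is a made-up name; it uses String#[] with a regex, which returns the matched substring or nil.

```ruby
# Without a quantifier the character class matches exactly one character.
p(/[0-9a-f]/.match('9f')[0])   # prints "9"

# Hypothetical helper: with + the class matches one or more characters in a row.
def first_hex_run(text)
  text[/[0-9a-f]+/]
end

p first_hex_run('9f')    # prints "9f"
p first_hex_run('xyz')   # prints nil
```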

extracting link from text [duplicate]

This question already has answers here:
How to extract URLs from text
(6 answers)
Closed 8 years ago.
I am trying to extract a link from a phrase. It could be anywhere (first, last or in the middle), so I am using this regex:
link=text.scan(/(^| )(http.*)($| )/)
The problem is that when the link is in the middle, it captures the whole phrase up to the end.
What should I do?
It's because the .* after http is greedy. I suggest using lookarounds.
link=text.scan(/(?<!\S)(http\S+)(?!\S)/)
OR
link=text.scan(/(?<!\S)(http\S+)/)
Example:
> "http://bar.com foo http://bar.com bar http://bar.com".scan(/(?<!\S)http\S+(?!\S)/)
=> ["http://bar.com", "http://bar.com", "http://bar.com"]
(?<!\S) Negative lookbehind which asserts that the match is not preceded by a non-space character.
http\S+ Matches the substring http plus the following one or more non-space characters.
(?!\S) Negative lookahead which asserts that the match is not followed by a non-space character.
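Wrapped in a method, the lookbehind version might look like the sketch below. The name extract_links is hypothetical. The pattern has no capture group, so scan returns the matched strings directly; note that any punctuation glued to the end of a link would be included in the match.

```ruby
# Hypothetical helper: collect every whitespace-delimited token starting with "http".
# (?<!\S) rejects an "http" that sits in the middle of a longer word.
def extract_links(text)
  text.scan(/(?<!\S)http\S+/)
end

p extract_links('see http://bar.com/a for docs then https://foo.org')
# prints ["http://bar.com/a", "https://foo.org"]
```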
Do all the links you are trying to match follow some simple pattern? We'd need to see more context to confidently provide a good solution to your problem.
For example, the regex:
link=text.scan(/http.*\.com/)
...might be good enough for the job (this assumes all links end in ".com"), but I can't say for sure without more information.
Or again, for example, perhaps you could use something like:
link=text.scan(/http[a-z.\/:]*/) - this assumes all links contain only lowercase letters, ".", "/" and ":".

Ruby Regex not matching what it should be [duplicate]

This question already has answers here:
How to match all occurrences of a regular expression in Ruby
(6 answers)
Closed 8 years ago.
I've got the following regex:
regex = /\$([a-zA-Z.]+)/
and the following query
query = "Show me the PE Ratio for $AAPL, $TSLA"
Now regex.match(query) should capture AAPL and TSLA, but instead I get the following:
#<MatchData "$AAPL" 1:"AAPL">
which is completely wrong. Anyone know why?
Note that this regex works fine on Rubular: http://rubular.com/r/j0maQHnVFF
In Ruby, the match method returns only the first match. You need all of the matches, which is what the /g flag gives you in Perl or JavaScript.
You can use the scan method. The scan method will either give you an array of all the matches or, if you pass it a block, pass each match to the block.
Code
query.scan(/\$([a-zA-Z.]+)/)
Fixed it, needed to use .scan instead of .match
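For comparison, here is a short sketch of both calls on the query from the question. The constant TICKER and the helper tickers are illustrative names. Because the pattern has a capture group, scan returns one array per match, hence the flatten.

```ruby
TICKER = /\$([a-zA-Z.]+)/

# Hypothetical helper: return every captured ticker symbol as a flat array.
def tickers(query)
  query.scan(TICKER).flatten
end

query = 'Show me the PE Ratio for $AAPL, $TSLA'
p TICKER.match(query)[1]  # prints "AAPL" (first match only)
p tickers(query)          # prints ["AAPL", "TSLA"]
```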

what is the best way to remove the last n characters of a string (in Ruby)? [duplicate]

This question already has answers here:
Ruby, remove last N characters from a string?
(13 answers)
Closed 5 years ago.
in Ruby,
I just want to get rid of the last n characters of a string,
but the following doesn't work
"string"[0,-3]
nor
"string".slice(0, -3)
I'd like a clean method, not anything like
"string".chop.chop.chop
It may be trivial, but could anyone show me how? Thanks!
You can use ranges.
"string"[0..-4]
You could use a regex with gsub ...
"string".gsub( /.{3}$/, '' )
If you add an ! to slice it will destructively remove the last n characters without having to assign it to another variable:
my_string.slice!(my_string.length-3,my_string.length)
compared to:
new = my_string.slice(0..-4)
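The attempts in the question fail because the second argument of "string"[0, x] is a length, not an end index. A non-destructive helper built on that form could look like this sketch; drop_last is a made-up name, and clamping the length at zero is an assumption about what should happen when n exceeds the string length.

```ruby
# Hypothetical helper: return a copy of str without its last n characters.
# str[0, len] takes a start index and a length, so the length to keep is
# computed explicitly and clamped at 0.
def drop_last(str, n)
  str[0, [str.length - n, 0].max]
end

p drop_last('string', 3)  # prints "str"
p drop_last('ab', 5)      # prints ""
```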
