Perfect way to write a gsub for a regex match? - ruby

I am trying to write a gsub for a regex match, but I imagine there's a better way to do this.
My expression:
ref.gsub(ref.match(/settings(.*)/)[1], '')
So that I can take this settings/animals, and return just settings.
But what if settings is not in the string? Then match returns nil and my [1] fails, as expected.
So how can one write the above statement, assuming that sometimes settings won't match?
Basically, if it finds the word settings, then get rid of anything after it. But if it doesn't, no worries.
Thanks!

Why not do the simplest possible thing that could work?
ref.gsub(/(settings)(.*)/, '\1')
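A minimal sketch of why this needs no nil check: when the pattern never matches, gsub simply returns the string unchanged. The helper name strip_after_settings and the sample paths are made up for illustration.

```ruby
# Hypothetical helper: keep everything up to and including "settings".
def strip_after_settings(ref)
  # \1 is the first capture group ("settings"); the rest of the line is dropped.
  ref.gsub(/(settings)(.*)/, '\1')
end

with_match    = strip_after_settings("settings/animals")  # "settings"
without_match = strip_after_settings("profile/animals")   # unchanged: "profile/animals"

puts with_match
puts without_match
```

Note that . does not match a newline by default, so on a multi-line string only the rest of the current line is removed.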

Related

Submatching repeating pattern

I am trying to put together a regexp in VBA, but even in Ruby I can't get it right.
The string:
<thead class="thead"><tr><th>FECHA</th><th>ITLUPVALOR</th><th>ITLUPPLAZO</th><th>ITLUP30DIAS</th><th>ITLUP60DIAS</th><th>ITLUP90DIAS</th><th>ITLUP180DIAS</th><th>ITLUP270DIAS</th><th>ITLUP360DIAS</th><th>ITLUP720DIAS</th><th>ITLUP1080DIAS</th><th>ITLUP1440DIAS</th><th>ITLUP1800DIAS</th></tr></thead>
What I have tried:
/(?:<thead class=\"thead\"><tr>)(<th>[^<]+?<\/th>)+(?:<\/tr><\/thead>)/m
The idea here (http://rubular.com/r/BpbPszctTw) was to have 9 submatches instead of one.
What am I missing?
Sorry, but a regex repeating group will only capture the last match in a group. See http://www.regular-expressions.info/captureall.html for more info.
Update: True, but if you let the regex match do the repeating for you, as in the other answer, you can get multiple matches, per http://rubular.com/r/BclU13qWYm ! In other words, accept the other answer, not this one. :-)
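To illustrate both points, here is a small sketch on a shortened version of the string (three headers instead of the full row). It assumes the "let the match do the repeating" approach means String#scan; the variable names are illustrative only.

```ruby
# Shortened sample input (assumption: three headers are enough to show the effect).
html = '<thead class="thead"><tr><th>FECHA</th><th>ITLUPVALOR</th><th>ITLUPPLAZO</th></tr></thead>'

# A repeated capture group keeps only its last iteration.
repeated  = /(?:<thead class="thead"><tr>)(<th>[^<]+?<\/th>)+(?:<\/tr><\/thead>)/m
last_only = html.match(repeated)[1]   # "<th>ITLUPPLAZO</th>"

# scan applies the pattern repeatedly and collects every capture.
headers = html.scan(/<th>([^<]+)<\/th>/).flatten

puts last_only
puts headers.inspect
```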
With this pattern you can obtain what you want:
/<thead class="thead"><tr>|\G<th>([^<]+)<\/th>/
Just remove the first result.
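A sketch of how that pattern might be used with String#scan, again on a shortened three-header string (an assumption for brevity). The first alternative has no capture group, so the first result is [nil]; \G then anchors each following match to the end of the previous one.

```ruby
# Shortened sample input for illustration.
html = '<thead class="thead"><tr><th>FECHA</th><th>ITLUPVALOR</th><th>ITLUPPLAZO</th></tr></thead>'

pattern = /<thead class="thead"><tr>|\G<th>([^<]+)<\/th>/

raw     = html.scan(pattern)     # [[nil], ["FECHA"], ["ITLUPVALOR"], ["ITLUPPLAZO"]]
headers = raw.drop(1).flatten    # drop the [nil] produced by the opening tags

puts headers.inspect
```

Because of \G, only th cells that directly follow the opening thead and tr tags are collected; a stray th elsewhere in the document would not match.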

Content Inside Parenthesis Regular Expression Ruby

I'm trying to take out the content inside the parentheses. For example, if the string is "(blah blah) This is stack(over)flow", I want to take out just "(blah blah)" but leave "(over)" alone. I'm trying
/\A\(.*\)/
but it returns "(blah blah) This is stack(over)", and I'm not sure why it's returning that.
Easiest fix:
/\A\(.*?\)/
Normally, * will try to match as much as it possibly can, so it'll match all the way to the last ) in the line. This is called "greedy" matching. Putting ? after +, * or ? makes the quantifier non-greedy (lazy), so it matches the shortest possible string.
But note that this won't work for nested parentheses. That's rather more complicated. Given your example, I assume this is for a pretty simple ad-hoc format where nesting isn't a concern.
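The difference can be seen side by side in a short sketch using the string from the question (variable names are illustrative):

```ruby
s = "(blah blah) This is stack(over)flow"

greedy = s[/\A\(.*\)/]    # "(blah blah) This is stack(over)"
lazy   = s[/\A\(.*?\)/]   # "(blah blah)"

# Removing the leading group leaves "(over)" untouched.
rest = s.sub(/\A\(.*?\)/, '')

puts greedy
puts lazy
puts rest
```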

Ruby Regular Expressions: Matching if substring doesn't exist

I'm having an issue trying to capture a group on a string:
"type=gist\nYou need to gist this though\nbecause its awesome\nright now\n</code></p>\n\n<script src=\"https://gist.github.com/3931634.js\"> </script>\n\n\n<p><code>Not code</code></p>\n"
My regex currently looks like this:
/<code>([\s\S]*)<\/code>/
My goal is to get everything in between the code brackets. Unfortunately, it's matching up to the second closing code bracket. Is there a way to match everything inside the code brackets up until the first occurrence of the ending code bracket?
All repetition quantifiers in regular expressions are greedy by default (matching as many characters as possible). Make the * ungreedy, like this:
/<code>([\s\S]*?)<\/code>/
But please consider using a DOM parser instead. Regex is just not the right tool to parse HTML.
And I just learned that, for going through multiple parts,
# str is the string being searched; /m lets . match newlines too
str.scan(/<code>(.*?)<\/code>/m) {
puts $1
}
is a very nice way of going through all occurrences of code. But yes, getting a proper parser is better.
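A small sketch comparing the greedy and lazy captures, plus scan over several blocks. The sample HTML is made up for illustration (the string in the question contains only one complete code element).

```ruby
# Hypothetical input with two code blocks, the first spanning two lines.
html = "<p><code>first\nblock</code></p>\n<p><code>second</code></p>\n"

greedy = html[/<code>([\s\S]*)<\/code>/, 1]    # runs on to the last </code>
lazy   = html[/<code>([\s\S]*?)<\/code>/, 1]   # stops at the first </code>

# /m makes . match newlines, so it behaves like [\s\S] here.
all_blocks = html.scan(/<code>(.*?)<\/code>/m).flatten

puts lazy
puts all_blocks.inspect
```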

Ruby regex return match based on negation

I just want to capture the part of the string in nbnbaasd<sd which appears before any a.
I want it to return nbnb as the match.
/.+(?!a)/.match("nbnbaasd<sd") # returns the whole string
Just use a negated character set:
/[^a]+/.match("nbnbaasd<sd")
It's far more efficient than the look-ahead method.
See it here in action: http://regexr.com?32288
Your original regex returns the whole string because .+ greedily consumes everything, and the end of "nbnbaasd<sd" is indeed not followed by an "a".
Try this.
/.+?(?=a)/.match("nbnbaasd<sd")
(You do not actually need to use a lookahead to achieve this, but perhaps you've simplified your problem and in your real problem you do need a zero-width assertion for some reason. So this is a solution as close as possible to the one you've attempted.)
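For comparison, a brief sketch running the three patterns from this thread on the same input. The second string, "xyz", is an added example showing where the two fixes differ when there is no "a" at all.

```ruby
s = "nbnbaasd<sd"

greedy_lookahead = /.+(?!a)/.match(s)[0]    # whole string: the end is not followed by "a"
negated          = /[^a]+/.match(s)[0]      # "nbnb"
lazy_lookahead   = /.+?(?=a)/.match(s)[0]   # "nbnb"

# With no "a" present, the negated set matches everything,
# while the lookahead version finds no match and returns nil.
no_a           = "xyz"
negated_no_a   = /[^a]+/.match(no_a)[0]
lookahead_no_a = /.+?(?=a)/.match(no_a)

puts negated
puts lazy_lookahead
puts lookahead_no_a.inspect
```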

Ruby gsub : is there a better way

I need to remove all leading and trailing non-numeric characters. This is what I came up with. Is there a better implementation?
puts s.gsub(/^\D+/,'').gsub(/\D+$/,'')
Instead of eliminating what you don't want, it's often clearer to select what you do want (using parentheses). Also, this only requires one regex evaluation:
s.match(/^\D*(.*?)\D*$/)[1]
Or, this convenient shorthand:
s[/^\D*(.*?)\D*$/, 1]
Perhaps a single gsub:
s.gsub(/(^\D+)|(\D+$)/, '')
Also, when in doubt Rubular it.
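A quick sketch checking that the variants above agree on a sample value (the input string is made up for illustration). The last line is an added variant using \A and \z, which anchor to the whole string rather than to each line as ^ and $ do in Ruby.

```ruby
s = "abc123def456ghi"

two_pass = s.gsub(/^\D+/, '').gsub(/\D+$/, '')   # original approach
captured = s[/^\D*(.*?)\D*$/, 1]                 # select what you want
single   = s.gsub(/(^\D+)|(\D+$)/, '')           # one gsub with alternation
anchored = s.gsub(/\A\D+|\D+\z/, '')             # whole-string anchors

puts [two_pass, captured, single, anchored].inspect
```

For single-line input all four give the same result; for multi-line strings the ^ and $ versions act on every line, so which one is right depends on the data.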

Resources