how do I use wildcards in a gsub? - ruby

I used some brute-force code to make a Roman numeral converter. I'm seeing some opportunities to DRY it up with the 5s and 10s:
out.gsub!('IVI','V')
out.gsub!('IXI','X')
out.gsub!('IXLI','XL')
So what I'd like to do is something like...
out.gsub!(/'I'.'I'/,/./)
Where '.' is any number of characters between two 'I's
Any ideas?

What you're looking for is /I(.*)I/, which will group the string between the Is. You can access that via \1, producing out.gsub!(/I(.*)I/, '\1').
Take a look at the documentation for regular expressions. http://ruby-doc.org/core-2.1.1/Regexp.html
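One thing to watch for: .* is greedy, so if a string holds more than one I-wrapped group, the match runs from the first I to the last one. A minimal sketch of the difference between the greedy and the lazy form, using a made-up sample string (an assumption, not taken from the original converter):

```ruby
# Hypothetical sample input containing two I-wrapped groups.
out = 'IVI IXLI'

# Greedy: .* stretches from the first I to the very last I.
greedy = out.gsub(/I(.*)I/, '\1')   # => "VI IXL"

# Lazy: .*? stops at the nearest closing I, so each group is handled separately.
lazy = out.gsub(/I(.*?)I/, '\1')    # => "V XL"
```

If each string only ever contains one wrapped group, the greedy version is fine.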

You can use the captures of a regex using \1, \2, etc.:
out = 'IVVVVI'
out.gsub!(/^I(.*)I$/, '\1')
# => "VVVV"

This might be worth trying:
out = 'IVVVVI'
out.tr('I','')
# => "VVVV"
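Keep in mind that removing characters this way drops every I, not only the outer pair. A small sketch of the difference, with String#delete swapped in for tr and a made-up sample value (both are assumptions for illustration):

```ruby
# Hypothetical input where I also appears inside the wrapped part.
out = 'IVIII'

# Character deletion removes every I in the string.
all_removed = out.delete('I')              # => "V"

# The anchored regex strips only the leading and trailing I.
outer_only = out.sub(/\AI(.*)I\z/, '\1')   # => "VII"
```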

Related

delete matched characters using regex in ruby

I need to write a regex for the following text:
"How can you restate your point (something like: \"<font>First</font>\") as a clear topic?"
that keeps whatever is between the
\" \"
characters (in this case <font>First</font>).
I came up with this:
/"How can you restate your point \(something like: |\) as a clear topic\?"/
but how do I get ruby to remove the unwanted surrounding text and only return <font>First</font>?
Use a lookbehind, a lookahead, and a lazy quantifier in place of the greedy one:
str[/(?<=\").+?(?=\")/] #=> "<font>First</font>"
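To see why the lazy quantifier matters here, this sketch runs both forms against a made-up string with two quoted parts (the sample text is an assumption). It also shows the capture-group form of String#[], which avoids the lookarounds:

```ruby
# Hypothetical input with two quoted segments.
str = 'say "one" then "two"'

# Lazy: stops at the first closing quote.
lazy = str[/(?<=").+?(?=")/]    # => "one"

# Greedy: runs on to the last quote in the string.
greedy = str[/(?<=").+(?=")/]   # => "one\" then \"two"

# Same result as the lazy form, using capture group 1 instead of lookarounds.
captured = str[/"(.+?)"/, 1]    # => "one"
```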
If you have strings just like that, you can split on the quote character and take the second element:
> str.split(/"/)[1]
=> "<font>First</font>"
You certainly can use a regular expression, but you don't need to:
str = "How can you restate (like: \"<font>First</font>\") as a clear topic?"
str[str.index('"')+1...str.rindex('"')]
#=> "<font>First</font>"
or, for those like me who never use three dots:
str[str.index('"')+1..str.rindex('"')-1]

Regular expression to clean string

I'm struggling to figure out even where to start with this. I believe there is a regular expression to make this a fairly straightforward task. I want to trim off the extra asterisks in a string.
Example string:
test="AM*BE*3***LAST****~"
I would like it to trim asterisks off only the end that don't have repeating symbols. So the resulting value in the variable would be:
test="AM*BE*3***LAST~"
In Perl I was able to use this:
s/\*+~+/~/;
Is there something similar I can do in Ruby? I'm sure there is, just struggling to find it for some reason. Any help would be greatly appreciated.
You could use this regex:
/\*+~$/
Then use the gsub method to replace all matches with a tilde ~:
test = "AM*BE*3***LAST****~"
test.gsub!(/\*+~$/, '~')
# => "AM*BE*3***LAST~"
Or you could use this more flexible regex, which matches any run of non-asterisk characters between the trailing asterisks and the end of the line:
/\*+([^*]+)$/
Then use the first capture group ($1) as the replacement:
test.gsub(/\*+([^*]+)$/) { $1 }
Ruby's String class has the [] method, which lets us use regexp as a parameter. We can also assign to that, allowing us to do things like:
foo = "AM*BE*3***LAST****~"
foo[/\*+~+$/] = '~'
foo # => "AM*BE*3***LAST~"
That reuses the match pattern from your Perl search/replace. (I'm assuming you only want to match at the end of the line because of your examples. If it needs to be anywhere in the string remove the trailing $ from the pattern.)
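One difference from sub worth knowing: assigning through [] raises IndexError when the pattern does not match, while sub simply returns the string unchanged. A minimal sketch, assuming an input that has no trailing asterisks to strip (the sample value and the outcome variable are made up for illustration):

```ruby
# Hypothetical input that is already clean: no asterisks directly before the tilde.
foo = "AM*BE*3***LAST~"

begin
  foo[/\*+~+$/] = '~'   # raises IndexError because nothing matches
  outcome = :replaced
rescue IndexError
  outcome = :no_match
end

# sub never raises on a miss; it returns an unchanged copy.
safe = foo.sub(/\*+~+$/, '~')
```

So the []= form suits cases where a miss should be loud, and sub suits cases where a miss is normal.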
You can test the regex on Rubular and refine it using the quick reference at the bottom of that page.
http://rubular.com/

gsub same pattern from a string

I have big problems with figuring out how regex works.
I want this text:
This is an example\e[213] text\e[123] for demonstration
to become this:
This is an example text for demonstration.
So this means that I want to remove all strings that begin with \e[ and end with ]
I just can't find a proper regex for this.
My current regex looks like this:
/.*?(\\e\[.*\])?.*/ig
But it doesn't work. I appreciate any help.
You only need to do this:
txt.gsub(/\\e\[[^\]]*\]/i, "")
There is no need to match what comes before or after with .*
The second problem is that you use .* to describe the content between the brackets. Since the * quantifier is greedy by default, it will match everything up to the last closing bracket on the same line.
To prevent this behaviour, one way is to replace the dot with a negated character class that excludes the closing square bracket: [^\]]. This way you keep the advantage of using a greedy quantifier.
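To make the difference concrete, this sketch runs the greedy dot, the negated character class, and a lazy quantifier over the text from the question (the variable names are just for illustration):

```ruby
# Single quotes keep \e as a literal backslash followed by e.
txt = 'This is an example\e[213] text\e[123] for demonstration'

# Greedy dot: swallows everything from the first \e[ to the last ].
greedy = txt.gsub(/\\e\[.*\]/, '')
# => "This is an example for demonstration"

# Negated class: cannot cross a ], so each tag is removed on its own.
negated = txt.gsub(/\\e\[[^\]]*\]/, '')
# => "This is an example text for demonstration"

# Lazy quantifier: same result as the negated class for this input.
lazy = txt.gsub(/\\e\[.*?\]/, '')
```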
gsub can do the global matching for you.
re = /\\e\[.+?\]/i
'This is an example\e[213] text\e[123] for demonstration'.gsub re, ''
=> "This is an example text for demonstration"
You can make the search non-greedy by using .+? in the regex:
puts 'This is an example\e[213] text\e[123] for demonstration'.gsub(/\\e\[.+?\]/, '')
This is an example text for demonstration
=> nil

Capture float in string using Ruby short-hand regex syntax

I have a Ruby string which contains a dollar amount that I would like to convert into a float. I found a short hand syntax for extracting the float from the string:
"$123.45"[/\d+\.\d+/].to_f
# => 123.45
Now I realize that it does not work when there is a comma in the number:
"$1,023.45"[/\d+\.\d+/].to_f
# => 23.45
How do I change the syntax of this regex to exclude the comma while still keeping the syntax as concise as possible?
You can delete the commas first using String#delete
"$1,023.45".delete(",")[/\d+\.\d+/].to_f
#=> 1023.45
"$1,023.45".gsub(/[\$,]/, '').to_f
# => 1023.45
p "$1,023.45".delete(",$").to_f #=> 1023.45
This regex should do the job: [/\d+[,.]\d+/]
[,.] means that either , or . may appear at that position.
Update: I thought you meant a comma in place of the ., as Europeans write it. So this might not work for you. You should delete the comma first to avoid a 1,234.56 situation, as the others stated. This cannot be solved with regex directly.
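If the amount is embedded in other text, the two steps can be combined: match the digits together with their thousands separators, then delete the commas before converting. A rough sketch; parse_dollars is a made-up helper name, and it assumes US-style formatting (comma for thousands, dot for decimals):

```ruby
# Hypothetical helper: returns the first amount in the text as a Float, or nil.
def parse_dollars(text)
  # A digit, then any mix of digits and commas, then an optional decimal part.
  match = text[/\d[\d,]*(?:\.\d+)?/]
  match && match.delete(',').to_f
end

parse_dollars("$1,023.45")   # => 1023.45
parse_dollars("$123.45")     # => 123.45
parse_dollars("no amount")   # => nil
```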

ruby remove variable length string from regular expression leaving hyphen

I have a string such as this: "im# -33.870816,151.203654"
I want to extract the two numbers including the hyphen.
I tried this:
mystring = "im# -33.870816,151.203654"
/\D*(\-*\d+\.\d+),(\-*\d+\.\d+)/.match(mystring)
This gives me:
33.870816,151.203654
How do I get the hyphen?
I need to do this in ruby
Edit: I should clarify, the "im# " was just an example; there can be any set of characters before the numbers. The numbers are mostly well formed with the comma. I was having trouble with the hyphen (-).
Edit2: Note that the two numbers are latitude and longitude. That pattern is mostly fixed. However, in theory, the preceding string can be arbitrary. I don't expect it to contain numbers or a hyphen, but you never know.
How about this?
arr = "im# -33.2222,151.200".split(/[, ]/)[1..-1]
Here arr is ["-33.2222", "151.200"] (using the split method).
Now arr[0].to_f is -33.2222 and arr[1].to_f is 151.2
EDIT: stripped the "im#" part with [1..-1] as suggested in the comments.
EDIT2: also, this works regardless of what the first characters are, as long as they contain no space or comma.
If you want to capture the two numbers with the hyphen you can use this regex:
> str = "im# -33.870816,151.203654"
> str.match(/([\d.,-]+)/).captures
=> ["-33.870816,151.203654"]
Edit: now it captures hyphen.
This one captures each number separately: http://rubular.com/r/NNP2OTEdiL
Note: String#scan matches all occurrences of the given pattern, in this case:
> str.scan(/(-?[\d.]+)/)
=> [["-33.870816"], ["151.203654"]] # Good, but flattened version is better
> str.scan(/(-?[\d.]+)/).flatten
=> ["-33.870816", "151.203654"]
I recommend playing around a little with Rubular. There are also some docs about regular expressions in Ruby:
http://www.ruby-doc.org/docs/ProgrammingRuby/html/language.html#UJ
http://www.regular-expressions.info/ruby.html
http://www.ruby-doc.org/core-1.9.3/Regexp.html
Your regex doesn't work because the hyphen is consumed by \D*, so you have to modify it to match only the right set of characters.
[^0-9-]* would be a good option.
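Here is that change applied to the regex from the question, next to the original for comparison. It is only a sketch: the variable names are made up, and -? (at most one hyphen) is used in place of the original \-*, which is an assumption about the input.

```ruby
mystring = "im# -33.870816,151.203654"

# Original: \D* is greedy and eats the hyphen before the group can see it.
original = /\D*(\-*\d+\.\d+),(\-*\d+\.\d+)/.match(mystring)
original[1]      # => "33.870816"

# Fixed: the prefix class excludes digits and the hyphen.
fixed = /[^0-9-]*(-?\d+\.\d+),(-?\d+\.\d+)/.match(mystring)
fixed.captures   # => ["-33.870816", "151.203654"]

lat, lng = fixed.captures.map(&:to_f)
```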
