delete matched characters using regex in ruby - ruby

I need to write a regex for the following text:
"How can you restate your point (something like: \"<font>First</font>\") as a clear topic?"
that keeps whatever is between the
\" \"
characters (in this case <font>First</font>
I came up with this:
/"How can you restate your point \(something like: |\) as a clear topic\?"/
but how do I get ruby to remove the unwanted surrounding text and only return <font>First</font>?

lookbehind, lookahead and making what is greedy, lazy.
str[/(?<=\").+?(?=\")/] #=> "<font>First</font>"

If you have strings just like that, you can .split and get the first:
> str.split(/"/)[1]
=> "<font>First</font>"

You certainly can use a regular expression, but you don't need to:
str = "How can you restate (like: \"<font>First</font>\") as a clear topic?"
str[str.index('"')+1...str.rindex('"')]
#=> "<font>First</font>"
or, for those like me who never use three dots:
str[str.index('"')+1..str.rindex('"')-1]

Related

Ruby how to remove a more than one space character?

Ruby
Okay, I want to remove a more than one space character in a strings if there's any. What I mean is, let's say I have a text like this:
I want to learn ruby more and more.
See there's a more than one space character after "to" and before "learn" either it a tab or just a several spaces. Now what I want is, how can I know if there's something like this in a text file, and I want to make it just one space per word or string. So it will become like this
I want to learn ruby more and more.
Can I use Gsub? or do I need to use other method? I tried Gsub, but can't figure out how to implement it the right way so it can produce the result I want. Hopefully I explained it clear. Any help is appreciated, thanks.
String#squeeze remove runs of more than one character:
'I want to learn ruby more and more.'.squeeze(' ')
# => "I want to learn ruby more and more."
You can use gsub to replace one or more whitespace (regex / +/) to a single whitespace:
'I want to learn ruby more and more.'.gsub(/ +/, " ")
#=> "I want to learn ruby more and more."
Use this regex to remove all whitespace from a string, including spaces and also tabs. I use this for stripping whitespace from email addresses on login fields.
' I want to learn ruby more and more.'.gsub(/\s/,"")
# => "Iwanttolearnrubymoreandmore."
The /\s/ matches any whitespace character including tabs, whereas / +/ won't.

gsub same pattern from a string

I have big problems with figuring out how regex works.
I want this text:
This is an example\e[213] text\e[123] for demonstration
to become this:
This is an example text for demonstration.
So this means that I want to remove all strings that begin with \e[ and end with ]
I just cant find a proper regex for this.
My current regex looks like this:
/.*?(\\e\[.*\])?.*/ig
But it dont work. I appreciate every help.
You only need to do this:
txt.gsub(/\\e\[[^\]]*\]/i, "")
There is no need to match what is before or after with .*
The second problem is that you use .* to describe the content between brackets. Since the * quantifier is by default greedy, it will match all until the last closing bracket in the same line.
To prevent this behaviour a way is to use a negated character class in place of the dot that excludes the closing square brackets [^\]]. In this way you keep the advantage of using a greedy quantifier.
gsub can do the global matching for you.
re = /\\e\[.+?\]/i
'This is an example\e[213] text\e[123] for demonstration'.gsub re, ''
=> "This is an example text for demonstration"
You can make the search less greedy by using .+? in the regex
puts 'This is an example\e[213] text\e[123] for demonstration'.gsub(/\\e\[.+?\]/, '')
This is an example text for demonstration
=> nil

ruby remove variable length string from regular expression leaving hyphen

I have a string such as this: "im# -33.870816,151.203654"
I want to extract the two numbers including the hyphen.
I tried this:
mystring = "im# -33.870816,151.203654"
/\D*(\-*\d+\.\d+),(\-*\d+\.\d+)/.match(mystring)
This gives me:
33.870816,151.203654
How do I get the hyphen?
I need to do this in ruby
Edit: I should clarify, the "im# " was just an example, there can be any set of characters before the numbers. the numbers are mostly well formed with the comma. I was having trouble with the hyphen (-)
Edit2: Note that the two nos are lattidue, longitude. That pattern is mostly fixed. However, in theory, the preceding string can be arbitrary. I don't expect it to have nos. or hyphen, but you never know.
How about this?
arr = "im# -33.2222,151.200".split(/[, ]/)[1..-1]
and arr is ["-33.2222", "151.200"], (using the split method).
now
arr[0].to_f is -33.2222 and arr[1].to_f is 151.2
EDIT: stripped "im#" part with [1..-1] as suggested in comments.
EDIT2: also, this work regardless of what the first characters are.
If you want to capture the two numbers with the hyphen you can use this regex:
> str = "im# -33.870816,151.203654"
> str.match(/([\d.,-]+)/).captures
=> ["33.870816,151.203654"]
Edit: now it captures hyphen.
This one captures each number separetely: http://rubular.com/r/NNP2OTEdiL
Note: Using String#scan will match all ocurrences of given pattern, in this case
> str.scan /\b\s?([-\d.]+)/
=> [["-33.870816"], ["151.203654"]] # Good, but flattened version is better
> str.scan(/\b\s?([-\d.]+)/).flatten
=> ["-33.870816", "151.203654"]
I recommend you playing around a little with Rubular. There's also some docs about regegular expressions with Ruby:
http://www.ruby-doc.org/docs/ProgrammingRuby/html/language.html#UJ
http://www.regular-expressions.info/ruby.html
http://www.ruby-doc.org/core-1.9.3/Regexp.html
Your regex doesn't work because the hyphen is caught by \D, so you have to modify it to catch only the right set of characters.
[^0-9-]* would be a good option.

How can I remove the string "\n" from within a Ruby string?

I have this string:
"some text\nandsomemore"
I need to remove the "\n" from it. I've tried
"some text\nandsomemore".gsub('\n','')
but it doesn't work. How do I do it? Thanks for reading.
You need to use "\n" not '\n' in your gsub. The different quote marks behave differently.
Double quotes " allow character expansion and expression interpolation ie. they let you use escaped control chars like \n to represent their true value, in this case, newline, and allow the use of #{expression} so you can weave variables and, well, pretty much any ruby expression you like into the text.
While on the other hand, single quotes ' treat the string literally, so there's no expansion, replacement, interpolation or what have you.
In this particular case, it's better to use either the .delete or .tr String method to delete the newlines.
See here for more info
If you want or don't mind having all the leading and trailing whitespace from your string removed you can use the strip method.
" hello ".strip #=> "hello"
"\tgoodbye\r\n".strip #=> "goodbye"
as mentioned here.
edit The original title for this question was different. My answer is for the original question.
When you want to remove a string, rather than replace it you can use String#delete (or its mutator equivalent String#delete!), e.g.:
x = "foo\nfoo"
x.delete!("\n")
x now equals "foofoo"
In this specific case String#delete is more readable than gsub since you are not actually replacing the string with anything.
You don't need a regex for this. Use tr:
"some text\nandsomemore".tr("\n","")
use chomp or strip functions from Ruby:
"abcd\n".chomp => "abcd"
"abcd\n".strip => "abcd"

Ruby gsub : is there a better way

I need to remove all leading and trailing non-numeric characters. This is what I came up with. Is there a better implementation.
puts s.gsub(/^\D+/,'').gsub(/\D+$/,'')
Instead of eliminating what you don't want, it's often clearer to select what you do want (using parentheses). Also, this only requires one regex evaluation:
s.match(/^\D*(.*?)\D*$/)[1]
Or, this convenient shorthand:
s[/^\D*(.*?)\D*$/, 1]
Perhaps a single #gsub(/(^\D+)|(\D+$)/, '')
Also, when in doubt Rubular it.

Resources