I would like to achieve this amazing result (I'm using Ruby):
input: "Joe can't tell between 'large' and large."
output: "Joe can't tell between large and large."
getting rid of the quotes but not of the apostrophe
how can I do it in a simple way?
my failed overcomplicated attempt:
entry = test[0].gsub(/[[']*1]/, "")
Simplest one for your situation could be something like this.
Regex: /\s'|'\s/ and replace with a space.
Regex101 Demo
You can also go with /(['"])([A-Za-z]+)\1/ and replace with \2, i.e. the second captured group.
Regex101 Demo
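In Ruby, the backreference version could look roughly like this. It is a minimal sketch: the method name unquote_words is made up for illustration, and the pattern is the one quoted above.

```ruby
# Hypothetical helper name; the pattern is the one from the answer above.
def unquote_words(text)
  # Group 1 captures the opening quote, \1 requires the same quote to close,
  # and '\2' keeps only the word that sat between the two quotes.
  text.gsub(/(['"])([A-Za-z]+)\1/, '\2')
end

puts unquote_words(%q(Joe can't tell between 'large' and "large".))
# Joe can't tell between large and large.
```

The apostrophe in can't survives because the letter after it is followed by a space, not by a matching quote.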
Here's a script to demo an answer:
x = "Joe can't tell between 'large' and large."
puts x.gsub(/'\s|\s'/, " ")
# Output: Joe can't tell between large and large.
To decode what this script does, the gsub / regex line is saying:
Find every (apostrophe followed by a whitespace character, '\s) or (whitespace
character followed by an apostrophe, \s') and replace it with a space.
This leaves apostrophes that aren't adjacent to spaces intact, which seems to remove only the apostrophes the OP is trying to remove.
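A quick way to see which apostrophes survive is to run the same gsub on a second input where the quotes touch the start of the string or a punctuation mark. This is only a sketch; the helper name strip_spaced_quotes and the second sentence are invented for the demo.

```ruby
# Hypothetical helper wrapping the gsub from the script above.
def strip_spaced_quotes(text)
  text.gsub(/'\s|\s'/, " ")
end

puts strip_spaced_quotes("Joe can't tell between 'large' and large.")
# Joe can't tell between large and large.

# Quotes that are not next to whitespace are left alone:
puts strip_spaced_quotes("'large' is the word, not 'small'.")
# 'large is the word, not small'.
```

So the approach fits the OP's sentence, but a quoted word at the very start of a string or right before punctuation keeps one of its quotes.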
Maybe this one?
entry = test[0].gsub(/'/, "")
But this removes every ', including the apostrophe in can't.
This does exactly what you are looking for, and it also leaves the Students' example from the comments untouched.
entry = test[0].gsub(/'([^\s]+)'/, '\1')
I don't have Ruby set up, but I confirmed this works here: http://tryruby.org/levels/1/challenges/0
Here is an example on regex101:
https://regex101.com/r/aY8aJ3/1
I need to write a regex for the following text:
"How can you restate your point (something like: \"<font>First</font>\") as a clear topic?"
that keeps whatever is between the
\" \"
characters (in this case <font>First</font>).
I came up with this:
/"How can you restate your point \(something like: |\) as a clear topic\?"/
but how do I get ruby to remove the unwanted surrounding text and only return <font>First</font>?
Use a lookbehind, a lookahead, and a lazy quantifier in place of the greedy one:
str[/(?<=\").+?(?=\")/] #=> "<font>First</font>"
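If the lookarounds feel heavy, a capture group gives the same result. The sketch below relies on String#[] accepting a regex plus a group index; the method name first_quoted is made up for illustration.

```ruby
# Hypothetical helper: returns the text between the first pair of double
# quotes, or nil when there is no such pair.
def first_quoted(str)
  # .+? is lazy, so it stops at the first closing quote it can find;
  # the trailing 1 asks String#[] for capture group 1 instead of the whole match.
  str[/"(.+?)"/, 1]
end

str = "How can you restate your point (something like: \"<font>First</font>\") as a clear topic?"
puts first_quoted(str)
# <font>First</font>
```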
If all your strings look just like that, you can .split on the quote and take the second element (index 1):
> str.split(/"/)[1]
=> "<font>First</font>"
You certainly can use a regular expression, but you don't need to:
str = "How can you restate (like: \"<font>First</font>\") as a clear topic?"
str[str.index('"')+1...str.rindex('"')]
#=> "<font>First</font>"
or, for those like me who never use three dots:
str[str.index('"')+1..str.rindex('"')-1]
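One thing to watch with the index/rindex version: index returns nil when the string has no quote at all, and nil + 1 raises NoMethodError. A guarded variant might look like this (a sketch only; between_quotes is a made-up name):

```ruby
# Hypothetical helper: text between the first and the last double quote,
# or nil when the string does not contain two of them.
def between_quotes(str)
  first = str.index('"')
  last  = str.rindex('"')
  # No quote at all, or only a single quote: nothing to extract.
  return nil if first.nil? || first == last

  str[(first + 1)...last]
end

puts between_quotes("How can you restate (like: \"<font>First</font>\") as a clear topic?")
# <font>First</font>
p between_quotes("no quotes here")
# nil
```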
I want to collapse runs of more than one space character in a string, if there are any. Say I have a text like this:
I want to     learn ruby more and more.
Notice there is more than one space character after "to" and before "learn", either a tab or several spaces. How can I detect something like this in a text file and reduce it to one space between words? So it will become like this:
I want to learn ruby more and more.
Can I use gsub, or do I need another method? I tried gsub, but couldn't figure out how to use it to produce the result I want. Hopefully I explained it clearly. Any help is appreciated, thanks.
String#squeeze collapses runs of a repeated character into a single one:
'I want to     learn ruby more and more.'.squeeze(' ')
# => "I want to learn ruby more and more."
You can use gsub to replace one or more spaces (regex / +/) with a single space:
'I want to     learn ruby more and more.'.gsub(/ +/, " ")
#=> "I want to learn ruby more and more."
Use this regex to remove all whitespace from a string, including spaces and also tabs. I use this for stripping whitespace from email addresses on login fields.
' I want to learn ruby more and more.'.gsub(/\s/,"")
# => "Iwanttolearnrubymoreandmore."
The /\s/ matches any whitespace character including tabs, whereas / +/ won't.
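Since the question mentions tabs as well as spaces, here is a sketch that collapses any run of spaces and tabs into one space while leaving newlines alone. The helper name collapse_blanks and the file name in the comment are assumptions for the demo, not anything from the question.

```ruby
# Hypothetical helper: squeezes runs of spaces and/or tabs into one space.
def collapse_blanks(line)
  # [ \t]+ matches one or more spaces or tabs, but not newlines,
  # so line breaks in a text file are preserved.
  line.gsub(/[ \t]+/, " ")
end

puts collapse_blanks("I want to \t   learn ruby more and more.")
# I want to learn ruby more and more.

# For a text file, something along these lines should work
# ("input.txt" is a placeholder name):
# File.foreach("input.txt") { |line| puts collapse_blanks(line) }
```

squeeze(' ') would leave the tab in place, which is why the character class is used here.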
I have big problems with figuring out how regex works.
I want this text:
This is an example\e[213] text\e[123] for demonstration
to become this:
This is an example text for demonstration.
So this means that I want to remove all substrings that begin with \e[ and end with ].
I just can't find a proper regex for this.
My current regex looks like this:
/.*?(\\e\[.*\])?.*/ig
But it doesn't work. I appreciate any help.
You only need to do this:
txt.gsub(/\\e\[[^\]]*\]/i, "")
There is no need to match what is before or after with .*
The second problem is that you use .* to describe the content between brackets. Since the * quantifier is greedy by default, it will match everything up to the last closing bracket on the same line.
One way to prevent this behaviour is to replace the dot with a negated character class that excludes the closing square bracket: [^\]]. This way you keep the advantage of using a greedy quantifier.
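To make the difference concrete, here is a small side-by-side sketch on the sample text from the question. The constant and method names are made up for the demo.

```ruby
# Single quotes keep \e as a literal backslash followed by "e".
SAMPLE = 'This is an example\e[213] text\e[123] for demonstration'

# Greedy dot: .* runs to the LAST ] on the line, swallowing " text" as well.
def strip_greedy(text)
  text.gsub(/\\e\[.*\]/, "")
end

# Negated class: [^\]]* cannot cross a ], so each \e[...] is matched on its own.
def strip_negated(text)
  text.gsub(/\\e\[[^\]]*\]/, "")
end

puts strip_greedy(SAMPLE)
# This is an example for demonstration
puts strip_negated(SAMPLE)
# This is an example text for demonstration
```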
gsub can do the global matching for you.
re = /\\e\[.+?\]/i
'This is an example\e[213] text\e[123] for demonstration'.gsub re, ''
=> "This is an example text for demonstration"
You can make the search less greedy by using .+? in the regex
puts 'This is an example\e[213] text\e[123] for demonstration'.gsub(/\\e\[.+?\]/, '')
This is an example text for demonstration
=> nil
Consider this example string:
mystr ="1. moody"
I want to capitalize the first letter that occurs in mystr. I am trying this regular expression in Ruby, but it still returns all the letters in mystr (moody) instead of only the letter m.
puts mystr.scan(/[a-zA-Z]{1}/)
Any help appreciated!
Do as below using String#sub
(arup~>~)$ pry --simple-prompt
>> s = "1. moody"
=> "1. moody"
>> s.sub(/[a-z]/i,&:upcase)
=> "1. Moody"
>>
If you want to modify the source string in place, use s.sub!(/[a-z]/i, &:upcase).
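The difference between the two calls, as a rough sketch (the method names are invented for illustration): sub returns a new string and leaves the receiver alone, while sub! changes the receiver and returns nil when nothing matched.

```ruby
# Hypothetical wrappers around the two calls from the answer.
def capitalized_copy(text)
  text.sub(/[a-z]/i, &:upcase)   # returns a new string
end

def capitalize_in_place(text)
  text.sub!(/[a-z]/i, &:upcase)  # mutates text; nil if there was no match
end

original = "1. moody".dup
puts capitalized_copy(original)  # 1. Moody
puts original                    # 1. moody  (unchanged)

capitalize_in_place(original)
puts original                    # 1. Moody  (changed in place)

p capitalize_in_place("123".dup) # nil, because there is no letter to replace
```

Because of that nil, it is safer not to chain further calls onto the result of sub!.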
Just for completeness, although it doesn’t directly answer your question as posed but could be relevant, consider this variation:
mystr ="1. école"
The line mystr.sub(/[a-z]/i,&:upcase) (as in Arup Rakshit’s answer) will match the second letter of the word, producing
1. éCole
The line mystr.sub /\b\s?[a-zA-Z]{1}/, &:upcase (diego.greyrobot’s answer) won’t match at all and so the line will be unchanged.
There are two problems here. The first is that [a-zA-Z] doesn’t match accented characters, so é isn’t matched. The fix for this is to use the \p{Letter} character property:
mystr.sub /\p{Letter}/, &:upcase
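This will match the character in question, but on Ruby versions before 2.4 it won't change it. This is due to the second problem, which is that upcase (and downcase) only worked on characters in the ASCII range in those versions; from Ruby 2.4 on they apply Unicode case mapping, so the line above is enough there. On older versions the fix is almost as easy, but relies on using an external library such as unicode_utils: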
require 'unicode_utils'
mystr.sub(/\p{Letter}/) { |c| UnicodeUtils.upcase(c)}
This results in:
1. École
which is probably what is wanted in this case.
This may not affect you if you are sure all your data is just ASCII, but is worth knowing for other situations.
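For reference, a sketch of the same idea without the external gem, assuming Ruby 2.4 or later, where String#upcase handles non-ASCII letters. The method name is made up, and the string is written with a \u escape only to keep the snippet encoding-proof.

```ruby
# Hypothetical helper; assumes Ruby >= 2.4 for Unicode-aware upcase.
def upcase_first_letter(text)
  # \p{Letter} matches any Unicode letter, so an accented first letter counts.
  text.sub(/\p{Letter}/, &:upcase)
end

puts upcase_first_letter("1. \u00E9cole")  # "1. école" written with an escape
puts upcase_first_letter("1. moody")
```

On older Rubies the accented letter would still be matched but left unchanged, which is where unicode_utils comes in.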
The reason your attempt returns all the letters is because you are using the scan method which does just that, it returns all the characters which match the regex, in your case letters. For your use case you should use sub since you only want to substitute 1 letter.
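A small sketch of what scan actually returns compared with asking for just the first match (the helper names are made up). Note that puts prints each array element on its own line, which is why the original attempt showed every letter.

```ruby
# Hypothetical helpers for the comparison.
def all_letters(text)
  text.scan(/[a-zA-Z]/)   # every match, as an array
end

def first_letter(text)
  text[/[a-zA-Z]/]        # only the first match, as a string (or nil)
end

p all_letters("1. moody")   # ["m", "o", "o", "d", "y"]
p first_letter("1. moody")  # "m"
```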
I use http://rubular.com to practice my Ruby Regexes. Here's what I came up with http://rubular.com/r/fAQEDFVEVn
The regex is: /\b[a-z]/
It uses \b to find a word boundary, then asks for a single letter with [a-z].
Finally we'll use sub to replace it with its upcased version:
"1. moody".sub /\b[a-z]/, &:upcase
=> "1. Moody"
Hope that helps.
I'm trying to get the first word in this string: Basic (11/17/2011 - 12/17/2011)
So ultimately wanting to get Basic out of that.
Other example string: Premium (11/22/2011 - 12/22/2011)
The format is always "Single-word followed by parenthesized date range" and I just want the single word.
Use this:
str = "Premium (11/22/2011 - 12/22/2011)"
str.split.first # => "Premium"
split uses whitespace as its default separator if you don't specify one.
After that, take the first element with first.
You don't need regexp for that, you can just use
str.split(' ')[0]
I know you found the answer you are needing but in case anyone stumbles on this in the future, in order to pull the needed value out of a large String of unknown length:
word_you_need = s.slice(/(\b[a-zA-Z]*\b \(\d+\/\d+\/\d+ - \d+\/\d+\/\d+\))/).split[0]
This regular expression will match the first word without the trailing space:
"^\w+ ??"
If you really want a regex you can get the first group after using this regex:
(\w*) .*
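In Ruby, pulling the first group out of that regex could look like this. It is a sketch, and first_word is a made-up name.

```ruby
# Hypothetical helper: returns capture group 1, or nil if the regex does not match.
def first_word(str)
  match = str.match(/(\w*) .*/)
  # match is nil when the string contains no space at all.
  match && match[1]
end

puts first_word("Basic (11/17/2011 - 12/17/2011)")
# Basic
puts first_word("Premium (11/22/2011 - 12/22/2011)")
# Premium
```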
"Single-word followed by parenthesized date range"
'word' and 'parenthesized date range' should be better defined
since, going by your requirement statement, they act as anchors and/or delimiters.
These raw regexes are just a general guess.
\w+(?=\s*\([^)]*\))
or
\w+(?=\s*\(\s*\d+(?:/\d+)*\s*-\s*\d+(?:/\d+)*\s*\))
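Here is how the first of those lookahead patterns might be used from Ruby. This is a sketch only: the constant and method names, and the longer sample sentence, are invented for the demo.

```ruby
# The first raw regex from above, as a Ruby literal.
PLAN_NAME = /\w+(?=\s*\([^)]*\))/

# Hypothetical helper: the word that sits right before a parenthesized
# group, or nil if there is none.
def plan_name(text)
  text[PLAN_NAME]
end

puts plan_name("Basic (11/17/2011 - 12/17/2011)")
# Basic

# Because the lookahead anchors on the parentheses, the word is found
# even when it is not the first word of the string:
puts plan_name("Plan: Premium (11/22/2011 - 12/22/2011) renewed")
# Premium
```

For the simple "word, then date range" format from the question, split.first remains the shorter option.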
Actually, all you need is:
s.split[0]
...or...
s.split.first