Changing "word" to "Word" using a RegEx like [A-Z]([a-z]*)\b - ruby

The title sums up my conundrum pretty well. I've been searching around the net for a while, and being new to Ruby and Regular Expressions as a whole, I'm stuck trying to figure out how to alter the case of a single word string using a RegEx "filter" such as [A-Z]([a-z]*)\b.
Basically I want the flow to be
input: woRD
filter: [A-Z]([a-z]*)\b
output: Word
I already have the words filtered into a list, so I don't need to match words; I only need to filter the case of the word using a RegEx filter.
I do not want to use standard capitalization methods, I want this to be done using Regular Expressions.

You can use
"woRD".downcase.capitalize
Ruby provides predefined methods for this kind of functionality. Try to use them instead of a regex; it saves coding time!

Well, for some reason you want to use regexps. Here you go:
# prepare hashes for gsub
to_down = (to_upper = Hash[('a'..'z').zip('A'..'Z')]).invert
# convert to downcase
downcased = 'woRD'.gsub(/[A-Z]/, to_down)
# ⇛ 'word'
titlecased = downcased.gsub(/^\w/, to_upper)
# ⇒ 'Word'
Hope it helps. Note the usage of String#gsub(re, hash) method.
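As a sketch of how String#gsub(re, hash) looks each match up in the hash, the two passes above can be folded into one. The merged swap hash and the (?!\A) guard are my own illustration, not part of the answer above, and assume ASCII-only words:

```ruby
# Lookup tables: lowercase => uppercase, and the inverse.
to_upper = ('a'..'z').zip('A'..'Z').to_h
to_down  = to_upper.invert
swap     = to_upper.merge(to_down)

# Match a lowercase letter at the very start, or an uppercase letter anywhere else.
# gsub replaces each match with the hash value stored under the matched text.
pattern = /\A[a-z]|(?!\A)[A-Z]/

titlecased = 'woRD'.gsub(pattern, swap)
puts titlecased  # prints Word
```

Letters that are already in the right case are simply not matched, so they pass through untouched.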

You can't use a regex alone to alter the string the way you want.
Please read this topic carefully: How to change case of letters in string using regex in Ruby.
The best way to solve your problem is to use:
"woRD".downcase.capitalize
or
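name_of_your_variable.capitalize!
if you want to alter the string in your variable in place without assigning it to another variable. Don't chain downcase!.capitalize! for this: downcase! returns nil when the string is already lowercase, so the chained call would fail.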
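One caveat with the bang methods, shown as a small sketch: they return nil when they make no change, so their return values are not safe to chain. The variable names are only illustrative.

```ruby
name = "word".dup          # already lowercase (dup in case string literals are frozen)
result = name.downcase!    # nothing to change, so this returns nil
puts result.inspect        # prints nil

name = "woRD".dup
name.capitalize!           # modifies name in place
puts name                  # prints Word
```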

Related

Using a ruby regular expression

I'm completely new to Ruby so I was just wondering if someone could help me out.
I have the following String:
"<planKey><key>OR-J8U</key></planKey>"
What is the regex I have to write to get the center part OR-J8U?
Use the following:
str = "<planKey><key>OR-J8U</key></planKey>"
str[/(?<=\<key\>).*(?=\<\/key\>)/]
#=> "OR-J8U"
This captures anything in between opening and closing 'key' tags using lookahead and lookbehinds
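For comparison, a minimal sketch of the same extraction with a capture group instead of lookarounds. The non-greedy .*? is my own choice here, assuming the string might contain more than one key element:

```ruby
str = "<planKey><key>OR-J8U</key></planKey>"

# The second argument to String#[] selects capture group 1.
key = str[/<key>(.*?)<\/key>/, 1]
puts key  # prints OR-J8U
```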
If you want to get the string OR-J8U then you could simply use that string in the regular expression; the - character only needs escaping inside a character class, so the backslash here is optional:
/OR\-J8U/
Though, I believe you want any string that is enclosed within <planKey><key> and </key></planKey>. In that case ice's answer is useful if you allow for an empty string:
/(?<=\<key\>).*(?=\<\/key\>)/
If you don't allow for an empty string, replace the * with +:
/(?<=\<key\>).+(?=\<\/key\>)/
If you prefer a more general approach (any string enclosed within any tags), then I believe the common opinion is not to use a regular expression. Instead consider using an HTML parser. On SO you can find some questions and answers in that regard.

Regular expression to clean string

I'm struggling to figure out even where to start with this. I believe there is a regular expression to make this a fairly straightforward task. I want to trim off the extra asterisks in a string.
Example string:
test="AM*BE*3***LAST****~"
I would like it to trim asterisks off only the end that don't have repeating symbols. So the resulting value in the variable would be:
test="AM*BE*3***LAST~"
In Perl I was able to use this:
s/\*+~+/~/;
Is there something similar I can do in Ruby? I'm sure there is, just struggling to find it for some reason. Any help would be greatly appreciated.
You could use this regex:
/\*+~$/
Then use the gsub method to replace all matches with a tilde ~:
test = "AM*BE*3***LAST****~"
test.gsub!(/\*+~$/, '~')
# => "AM*BE*3***LAST~"
Or you could use this more flexible regex, which matches any amount of characters after * until end of line:
/\*+([^*]+)$/
Then use the first capture group ($1) as the replacement:
test.gsub(/\*+([^*]+)$/) { $1 }
Ruby's String class has the [] method, which lets us use regexp as a parameter. We can also assign to that, allowing us to do things like:
foo = "AM*BE*3***LAST****~"
foo[/\*+~+$/] = '~'
foo # => "AM*BE*3***LAST~"
That reuses the match pattern from your Perl search/replace. (I'm assuming you only want to match at the end of the line because of your examples. If it needs to be anywhere in the string remove the trailing $ from the pattern.)
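Note that assigning through foo[regex] raises IndexError when the pattern does not match. A minimal sketch of a non-raising alternative with String#sub, where the helper name collapse_trailing_stars is hypothetical:

```ruby
# Hypothetical helper: replace a trailing run of asterisks plus tildes with one tilde.
# sub returns an unchanged copy when there is no match.
def collapse_trailing_stars(str)
  str.sub(/\*+~+$/, '~')
end

cleaned   = collapse_trailing_stars("AM*BE*3***LAST****~")
untouched = collapse_trailing_stars("AM*BE*3***LAST~")

# The []= form raises if the pattern is absent.
raised = false
begin
  foo = "AM*BE*3***LAST~".dup
  foo[/\*+~+$/] = '~'
rescue IndexError
  raised = true
end

puts cleaned    # prints AM*BE*3***LAST~
puts untouched  # prints AM*BE*3***LAST~
puts raised     # prints true
```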
You can use Rubular and try to test the regex and achieve what you need based on the references down the page.
http://rubular.com/

Regex can this be achieved

Am I too ambitious, or is there a way to do this:
add a string if it is not present,
and
remove the same string if it is present?
I'd like to do all of this using a regex and avoid the if/else statement.
Here is an example.
I have the string
"admin,artist,location_manager,event_manager"
so can the substring location_manager be added or removed according to the conditions above?
Basically I'm looking to avoid the if/else statement and do all of this purely in regex:
"admin,artist,location_manager,event_manager".test(/some_regex/)
The some_regex would remove location_manager from the string if present, else it would add it.
Am I being overambitious?
You will need to use some sort of logic.
str += ',location_manager' unless str.gsub!(/location_manager,/,'')
I'm assuming that if it's not present you append it to the end of the string
Regex will not actually add or remove anything in any language that I am aware of. It is simply used to match. You must use some other language construct (a regex based replacement function for example) to achieve this functionality. It would probably help to mention your specific language so as to get help from those users.
Here's one kinda off-the-wall solution. It doesn't use regexes, but it also doesn't use any if/else statements either. It's more academic than production-worthy.
Assumptions: Your string is a comma-separated list of titles, and that these are a unique set (no duplicates), and that order doesn't matter:
require 'set'
str = "admin,artist,location_manager,event_manager"
titles = Set.new(str.split(','))
#=> #<Set: {"admin", "artist", "location_manager", "event_manager"}>
titles_to_toggle = ["location_manager"]
#=> ["location_manager"]
titles ^= titles_to_toggle
#=> #<Set: {"admin", "artist", "event_manager"}>
titles ^= titles_to_toggle
#=> #<Set: {"location_manager", "admin", "artist", "event_manager"}>
titles.to_a.join(",")
#=> "location_manager,admin,artist,event_manager"
All this assumes that you're using a string as a kind of set. If so, you should probably just use a set. If not, and you actually need string-manipulation functions to operate on it, there's probably no way around except for using if-else, or a variant, such as the ternary operator, or unless, or Bergi's answer
Also worth noting regarding regex as a solution: Make sure you consider the edge cases. If 'location_manager' is in the middle of the string, will you remove the extraneous comma? Will you handle removing commas correctly if it's at the beginning or the end of the string? Will you correctly add commas when it's added? For these reasons treating a set as a set or array instead of a string makes more sense.
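To make the edge cases concrete, here is a minimal sketch that treats the string as an array for the toggle, so commas never need special handling. The helper name toggle_role is hypothetical, and it leans on || rather than an if/else:

```ruby
# Hypothetical helper: remove role if present, append it otherwise.
def toggle_role(str, role)
  roles = str.split(',')
  # Array#delete returns nil when the role was absent, so || falls through to push.
  roles.delete(role) || roles.push(role)
  roles.join(',')
end

removed = toggle_role("admin,artist,location_manager,event_manager", "location_manager")
added   = toggle_role(removed, "location_manager")

puts removed  # prints admin,artist,event_manager
puts added    # prints admin,artist,event_manager,location_manager
```

Because join only places commas between elements, removing the first or last item leaves no stray separator.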
No. Regex can only match/test whether "a string" is present (or not). Then, the function you've used can do something based on that result, for example replace can remove a match.
Yet, you want to do two actions (each can be done with regex), remove if present and add if not. You can't execute them sequentially, because they overlap - you need to execute either the one or the other. This is where if-else structures (or ternary operators) come into play, and they are required if there is no library/native function that contains them to do exactly this job. I doubt there is one in Ruby.
If you want to avoid the if-else-statement (for one-liners or expressions), you can use the ternary operator. Or, you can use a lambda expression returning the correct value:
# kind of pseudo code
string.replace(/location,?|$/, function($0) return $0 ? "" : ",location" )
This matches the string "location" (with optional comma) or the string end, and replaces that with nothing if a match was found or the string ",location" otherwise. I'm sure you can adapt this to Ruby.
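A rough Ruby adaptation of that pseudo code, as a sketch only. The helper name toggle_with_sub is hypothetical, and it assumes the role is never the last item in the list (otherwise a trailing comma is left behind) and never a substring of another role:

```ruby
# Hypothetical helper: one sub call that either removes the role or appends it.
def toggle_with_sub(str, role)
  # First alternative: the role plus an optional comma. Second: the empty match at the end.
  pattern = /#{Regexp.escape(role)},?|\z/
  # sub replaces only the first match; an empty match means the role was absent.
  str.sub(pattern) { |match| match.empty? ? ",#{role}" : "" }
end

without_role = toggle_with_sub("admin,artist,location_manager,event_manager", "location_manager")
with_role    = toggle_with_sub(without_role, "location_manager")

puts without_role  # prints admin,artist,event_manager
puts with_role     # prints admin,artist,event_manager,location_manager
```

Using sub rather than gsub matters here: gsub would also hit the empty match at the end of the string and re-add the role it had just removed.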
To remove something matching a pattern is really easy:
(admin,?|artist,?|location_manager,?|event_manager,?)
Then choose the string to replace the match (in your case an empty string) and pass everything to the replace method.
The other operation you suggested is more difficult to achieve with regex only. Maybe someone knows a better answer.

Using Ruby on a string, how can I slice between two parts of the string using RegEx?

I just want to save the text between two specific points in a string into a variable. The text would look like this:
..."content"=>"The text I want to save to a variable"}]...
I suppose I would have to use scan or slice, but not exactly sure how to pull out just the text without grabbing the RegEx identifiers before and after the text. I tried this, but it didn't work:
var = mystring.slice(/\"content\"\=\>\".\"/)
This should do the job
var = mystring[/"content"=>"(.*)"/, 1]
Note that:
.slice aliases []
none of the characters you escaped are special regexp characters where you're using them
you can "group" the bit you want to keep with ()
.slice / [] take a second parameter to pick a matched group
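One caveat worth a sketch: .* is greedy, so if the line holds more than one quoted value the capture runs to the last quote. The sample text below is made up for illustration, and [^"]* is one possible way to stop at the first closing quote:

```ruby
mystring = '{"content"=>"first", "other"=>"second"}'

greedy = mystring[/"content"=>"(.*)"/, 1]
strict = mystring[/"content"=>"([^"]*)"/, 1]

puts greedy  # prints first", "other"=>"second
puts strict  # prints first
```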
your_text = '"content"=>"The text I want to save to a variable"'
/"content"=>"(?<hooray>.*)"/ =~ your_text
Afterwards, the local variable hooray will be magically set to contain your text. This can be used to set multiple variables.
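Named captures are also available through MatchData, which helps when the pattern is not a literal on the left of =~ (in that case no local variables are created). A minimal sketch, where the group name content is my own:

```ruby
your_text = '"content"=>"The text I want to save to a variable"'

match = your_text.match(/"content"=>"(?<content>.*)"/)
# match is nil when nothing matched, so guard before indexing.
content = match && match[:content]
puts content  # prints The text I want to save to a variable
```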
This regex will match your string:
/\"content\"=>\"(.*)\"/
you can try rubular.com for testing
It looks like you're trying to truncate a sentence. You can split the sentence either on punctuation, or even on words.
mystring.split(".")
mystring.split("word")

Very odd issue with Ruby and regex

I am getting completely different results from string.scan and several regex testers...
I am just trying to grab the domain from the string, it is the last word.
The regex in question:
/([a-zA-Z0-9\-]*\.)*\w{1,4}$/
The string (1 single line, verified in Ruby's runtime btw)
str = 'Show more results from software.informer.com'
It works fine in the testers, but in Ruby...
irb(main):050:0> str.scan /([a-zA-Z0-9\-]*\.)*\w{1,4}$/
=> [["informer."]]
I would think that I would get a match on software.informer.com, which is my goal.
Your regex is correct, the result has to do with the way String#scan behaves. From the official documentation:
"If the pattern contains groups, each individual result is itself an array containing one entry per group."
Basically, if you put parentheses around the whole regex, the first element of each array in your results will be what you expect.
It does not look as if you expect more than one result (especially as the regex is anchored). In that case there is no reason to use scan.
'Show more results from software.informer.com'[ /([a-zA-Z0-9\-]*\.)*\w{1,4}$/ ]
#=> "software.informer.com"
If you do need to use scan (in which case you obviously need to remove the anchor), you can use (?:) to create non-capturing groups.
'foo.bar.baz lala software.informer.com'.scan( /(?:[a-zA-Z0-9\-]*\.)*\w{1,4}/ )
#=> ["foo.bar.baz", "lala", "software.informer.com"]
You are getting a match on software.informer.com. Check the value of $&. The return of scan is an array of the captured groups. Add capturing parentheses around the suffix, and you'll get the .com as part of the return value from scan as well.
The regex testers and Ruby are not disagreeing about the fundamental issue (the regex itself). Rather, their interfaces are differing in what they are emphasizing. When you run scan in irb, the first thing you'll see is the return value from scan (an Array of the captured subpatterns), which is not the same thing as the matched text. Regex testers are most likely oriented toward displaying the matched text.
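The difference between the return value of scan and the matched text can be seen side by side in a short sketch, using the string and pattern from the question (the variable names are mine):

```ruby
str = 'Show more results from software.informer.com'
re  = /([a-zA-Z0-9\-]*\.)*\w{1,4}$/

groups = str.scan(re)   # one array per match, one entry per capture group
whole  = str[re]        # the full matched text
plain  = str.scan(/(?:[a-zA-Z0-9\-]*\.)*\w{1,4}$/)  # no capture groups, so scan returns matched text

p groups  # prints [["informer."]]
p whole   # prints "software.informer.com"
p plain   # prints ["software.informer.com"]
```

A repeated capture group keeps only its last iteration, which is why the group holds "informer." rather than "software.informer.".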
How about doing this :
/([a-zA-Z0-9\-]*\.*\w{1,4})$/
This returns
informer.com
On your test string.
http://rubular.com/regexes/13670

Resources