Extracting a word with regex - ruby

I want to replace $word with another word in the following string:
"Hello $word How are you"
I used /\$(.*)/, /\$(.*)(\s)/, and /\$(.* \s)/. Because of the *, I get the whole string after $, but I only need that word; I need to escape the space. I tried \s, \b, and a few other options, but I cannot figure it out. Any help would be appreciated.

* is a greedy operator, meaning it will match as much as it can while still allowing the remainder of the regular expression to match. The token .* first greedily matches every character up to the end of the string. The regex engine then backtracks until the next token, \s, can match; that happens at the last whitespace before the word "you", giving you a result of "word How are".
You can use \S+ in place of .*; \S matches any non-whitespace character.
\$\S+
Or to simply match only word characters, you can use the following:
\$\w+
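As a quick check of the difference, here is a minimal sketch that runs the patterns above against one string. The helper name placeholder_matches and the comma added to the sample string are made up for illustration.

```ruby
# Compare what each pattern matches in the same string.
# `placeholder_matches` is a hypothetical helper, not part of Ruby.
def placeholder_matches(text)
  {
    greedy_to_space: text[/\$(.*)\s/, 1], # .* gives back only up to the LAST whitespace
    non_whitespace:  text[/\$\S+/],       # stops at the first whitespace
    word_chars:      text[/\$\w+/]        # stops at the first non-word character
  }
end

result = placeholder_matches("Hello $word, How are you")
puts result[:greedy_to_space] # prints: word, How are
puts result[:non_whitespace]  # prints: $word,
puts result[:word_chars]      # prints: $word
```

Note that \S+ keeps the trailing comma while \w+ does not, which is usually the deciding factor between the two.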

If you only want to replace "$word" using a regex, try this:
"Hello $word How are you".gsub(/\$word/, 'other_word')
Or:
"Hello $word How are you".sub('$word',"*")
You can read more for gsub here: http://www.ruby-doc.org/core-2.2.0/String.html#method-i-gsub

Substituting placeholder words for other words is usually not done with a regex but with the % method and a hash:
h = {word: "aaa", other_word: "bbb"}
p "Hello %{word} How are you. %{other_word}. Bye %{word}" % h
# => "Hello aaa How are you. bbb. Bye aaa"
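If the template has to keep the $word syntax from the question, a similar hash lookup can be sketched with gsub and a block. The helper fill_placeholders is hypothetical, and leaving unknown placeholders untouched is an assumption about the desired behavior.

```ruby
# Hypothetical helper: replace each $name with a value from a hash.
# Placeholders that are not in the hash are left as they are.
def fill_placeholders(template, values)
  # $1 is the captured name, $& is the whole match (for example "$word").
  template.gsub(/\$(\w+)/) { values.fetch($1.to_sym, $&) }
end

h = { word: "aaa", other_word: "bbb" }
puts fill_placeholders("Hello $word How are you. $other_word. Bye $word", h)
# prints: Hello aaa How are you. bbb. Bye aaa
```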

Consider:
>> string = "Hello $word How are you"
=> "Hello $word How are you"
>> replace_regex = /(?<replace_word>\$\w+)/
=> /(?<replace_word>\$\w+)/
>> string.gsub(replace_regex, "Bob")
=> "Hello Bob How are you"
>> string.match(replace_regex)[:replace_word]
=> "$word"
Note:
replace_regex is the regex; replace_word is the name of its capture group.
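A named group can also be used inside the replacement itself. The sketch below is a variation on the session above, not the answerer's code; the constant NAME_REGEX and the group name "name" are made up, and the group here deliberately excludes the $ sign.

```ruby
# Hypothetical pattern: the group captures the word without the leading $.
NAME_REGEX = /\$(?<name>\w+)/

string = "Hello $word How are you"

# \k<name> in a replacement string inserts the text of the named group.
puts string.gsub(NAME_REGEX, '[\k<name>]')
# prints: Hello [word] How are you

# In the block form, the current match is available as $~.
puts string.gsub(NAME_REGEX) { $~[:name].upcase }
# prints: Hello WORD How are you
```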

Related

Regexp for certain character to end of line

I have a string
"So on and so forth $5.99"
I would like to extract everything after the $ until the end of the line.
/$ finds the character $. How do I select the rest of the string? I know it's something \z but I can't get the syntax right.
In regexp $ represents the end of the line.
So in your case you need \$.*$ to include your escaped $ and everything (.*) up until the end of the line ($).
No, /$ does not match that character. You need to escape it with a backslash (\$) to match a literal $.
string = "So on and so forth $5.99"
result = string.match(/\$(.*)$/)
puts result[1] #=> "5.99"
If you want to capture everything after the $, you'll want:
/\$(.*)\z/
See http://rubular.com/r/T4fR1SEl3j
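The two anchors only differ once the string has more than one line. A small sketch, with the constant names and the two-line sample string invented for illustration:

```ruby
LINE_END          = /\$(.*)$/    # $  : end of any line
STRING_END        = /\$(.*)\z/   # \z : end of the whole string; . stops at "\n"
STRING_END_DOTALL = /\$(.*)\z/m  # /m lets . cross "\n" as well

one_line  = "So on and so forth $5.99"
two_lines = "So on and so forth $5.99\nThanks"

p one_line[LINE_END, 1]           # prints "5.99"
p one_line[STRING_END, 1]         # prints "5.99"
p two_lines[LINE_END, 1]          # prints "5.99"
p two_lines[STRING_END, 1]        # prints nil
p two_lines[STRING_END_DOTALL, 1] # prints "5.99\nThanks"
```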

Why $ doesn't match \r\n

Can someone explain this:
str = "hi there\r\n\r\nfoo bar"
rgx = /hi there$/
str.match rgx # => nil
rgx = /hi there\s*$/
str.match rgx # => #<MatchData "hi there\r\n\r">
On the one hand it seems like $ does not match \r. But then if I first capture all the white spaces, which also include \r, then $ suddenly does appear to match the second \r, not continuing to capture the trailing "\nfoo bar".
Is there some special rule here about consecutive \r\n sequences? The docs on $ simply say it will match "end of line" which doesn't explain this behavior.
$ is a zero-width assertion. It doesn't match any character, it matches at a position. Namely, it matches either immediately before a \n, or at the end of string.
/hi there\s*$/ matches because \s* matches "\r\n\r", which allows the $ to match at the position before the second \n. The $ could have also matched at the position before the first \n, but the \s* is greedy and matches as much as it can, while still allowing the overall regex to match.
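The positions where $ can match are easy to probe directly. This is a minimal sketch using the string from the question; the constant name CRLF_SAMPLE and the two extra patterns (\r and the lazy \s*?) are additions for illustration.

```ruby
CRLF_SAMPLE = "hi there\r\n\r\nfoo bar"

p CRLF_SAMPLE[/hi there$/]     # prints nil: the next character is "\r", not "\n"
p CRLF_SAMPLE[/hi there\r$/]   # prints "hi there\r": $ matches just before the first "\n"
p CRLF_SAMPLE[/hi there\s*$/]  # prints "hi there\r\n\r": greedy, stops before the second "\n"
p CRLF_SAMPLE[/hi there\s*?$/] # prints "hi there\r": lazy, stops before the first "\n"
```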

How to escape newline in regex scan

str = "This\n is a sample text for test"
str.scan(/\S.{0,15}\S(?=\s|$)|\S+/)
# => ["This", "is a sample text", "for test"]
Here, it splits when the newline (\n) is present. I actually want the output as,
["This\n is a", "sample text for", "test"]
How can I achieve that?
Use the /m modifier which allows the dot to match newlines:
str.scan(/\S.{0,15}\S(?=\s|\z)|\S+/m)
Also, I suggest you use \z instead of $ because $ matches the end of a line; \z is the only way to force Ruby to match the end of the string. It doesn't matter in this example, but it's a good habit to get into. Ruby differs from most other regex flavors in these two points: /m is what makes the dot match newlines, and $ always matches at line ends.
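Both points can be checked on a two-line string. A minimal sketch, with the constant name TWO_LINES made up for the example:

```ruby
TWO_LINES = "foo\nbar"

# Without /m the dot does not match "\n"; with /m it does.
p TWO_LINES.match?(/foo.bar/)   # prints false
p TWO_LINES.match?(/foo.bar/m)  # prints true

# $ matches at the end of every line; \z only at the end of the string.
p TWO_LINES.match?(/foo$/)      # prints true
p TWO_LINES.match?(/foo\z/)     # prints false
p TWO_LINES.match?(/bar\z/)     # prints true
```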

Why does '\n' not work, and what does $/ mean?

Why doesn't this code work:
"hello \nworld".each_line(separator = '\n') {|s| p s}
while this works?
"hello \nworld".each_line(separator = $/) {|s| p s}
A 10 second google yielded this:
$/ is the input record separator, newline by default.
The first one doesn't work because you used single quotes. Escape sequences such as \n are not interpreted in single-quoted strings (only \\ and \' are). Use double quotes instead:
"hello \nworld".each_line(separator = "\n") {|s| p s}
First, newline is the default. All you need is
"hello \nworld".each_line {|s| p s}
Secondly, single quotes behave differently than double quotes. '\n' means a literal backslash followed by the letter n, whereas "\n" means the newline character.
Last, the special variable $/ is the record separator which is "\n" by default, which is why you don't need to specify the separator in the above example.
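The difference between the two literals shows up as soon as you inspect them. A short sketch; the constant names are invented for the example, and the $/ check assumes the separator has not been reassigned.

```ruby
SINGLE_QUOTED = '\n' # two characters: a backslash and the letter n
DOUBLE_QUOTED = "\n" # one character: a newline

p SINGLE_QUOTED.length   # prints 2
p DOUBLE_QUOTED.length   # prints 1
p $/ == DOUBLE_QUOTED    # prints true (default record separator)

# The literal backslash-n never occurs in the string, so nothing is split.
p "hello \nworld".each_line(SINGLE_QUOTED).to_a # prints ["hello \nworld"]
p "hello \nworld".each_line(DOUBLE_QUOTED).to_a # prints ["hello \n", "world"]
```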
Simply gsub! your string to turn each literal \n into a valid "\n" newline character:
text = 'hello \nworld'
text.gsub!('\n', "\n")
After that, each \n in the string will act like a newline character.

How to add a single backslash character to a string in Ruby?

I want to insert a backslash before the apostrophe in the string "children's world". Is there an easy way to do it?
irb(main):035:0> s = "children's world"
=> "children's world"
irb(main):036:0> s.gsub('\'', '\\\'')
=> "childrens worlds world"
Answer
You need some extra backslashes:
>> puts "children's world".gsub("'", '\\\\\'')
children\'s world
or slightly more concisely (since you don't need to escape the ' in a double-quoted string):
>> puts "children's world".gsub("'", "\\\\'")
children\'s world
or even more concisely:
>> puts "children's world".gsub("'") { "\\'" }
children\'s world
Explanation
Your '\\\'' generates \' as a string:
>> puts '\\\''
\'
and \' is a special replacement pattern in Ruby. From ruby-doc.org:
you may refer to some special match variables using these combinations ... \' corresponds to $', which contains string after match
So the \' that gsub sees in the second argument is being interpreted as a special pattern (everything in the original string after the match) instead of as a literal \'.
So what you want gsub to see is actually \\', which can be produced by '\\\\\'' or "\\\\'".
Or, if you use the block form of gsub (gsub("xxx") { "yyy" }) then Ruby takes the replacement string "yyy" literally without trying to apply replacement patterns.
Note: If you have to create a replacement string with a lot of \s you could take advantage of the fact that when you use /.../ (or %r{...}) you don't have to double-escape the backslashes:
>> puts "children's world".gsub("'", /\\'/.source)
children\'s world
Or you could use a single-quoted heredoc: (using <<'STR' instead of just <<STR)
>> puts "children's world".gsub("'", <<'STR'.strip)
\\'
STR
children\'s world
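To see the replacement patterns described above side by side, here is a minimal sketch. The constant name APOSTROPHE_SAMPLE is made up, and the pre-match pattern is included only for comparison.

```ruby
APOSTROPHE_SAMPLE = "children's world"

# "\\'" is the two characters \' : gsub expands it to the text AFTER the match.
puts APOSTROPHE_SAMPLE.gsub("'", "\\'")     # prints: childrens worlds world

# "\\`" is the two characters \` : gsub expands it to the text BEFORE the match.
puts APOSTROPHE_SAMPLE.gsub("'", "\\`")     # prints: childrenchildrens world

# "\\\\'" is the three characters \\' : an escaped backslash, then a plain '.
puts APOSTROPHE_SAMPLE.gsub("'", "\\\\'")   # prints: children\'s world

# The block's return value is inserted literally, with no pattern expansion.
puts APOSTROPHE_SAMPLE.gsub("'") { "\\'" }  # prints: children\'s world
```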
>> puts s.gsub("'", "\\\\'")
children\'s world
Your problem is that the string "\'" is meaningful to gsub in a replacement string. To make it work the way you want, either double the backslash as above or use the block form:
s.gsub("'") {"\\'"}
