How do I match something that is not a letter or a number or a space? - ruby

I'm using Ruby 2.4. How do I match something that is not a letter or a number or a space? I tried
2.4.0 :004 > str = "-"
=> "-"
2.4.0 :005 > str =~ /[^[:alnum:]]*/
=> 0
2.4.0 :006 > str = " "
=> " "
2.4.0 :007 > str =~ /[^[:alnum:]]*/
=> 0
but as you can see it is still matching a space.

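Your /[^[:alnum:]]*/ pattern matches 0 or more characters other than alphanumeric ones. Because * also allows an empty match, it succeeds at index 0 on any string, and since a space is not alphanumeric, it matches whitespace too.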
To match 1 or more chars other than alphanumeric and whitespace, you can use
/[^[:alnum:][:space:]]+/
Use the negated bracket expression with the relevant POSIX character classes inside.
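As a quick check of the difference between the two patterns, here is a minimal sketch; the variable names and sample strings are only illustrative and are not from the question.

```ruby
# Original pattern: "*" allows an empty match, so it succeeds on any input.
loose = /[^[:alnum:]]*/
# Suggested pattern: one or more chars that are neither alphanumeric nor whitespace.
strict = /[^[:alnum:][:space:]]+/

loose_on_letters = ("abc" =~ loose)       # => 0 (empty match at the start)
strict_on_dash   = ("-" =~ strict)        # => 0
strict_on_space  = (" " =~ strict)        # => nil
strict_on_words  = ("abc 123" =~ strict)  # => nil
found            = "foo, bar!".scan(strict) # => [",", "!"]
```

If an empty match is never wanted, prefer + over * in this kind of test.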

Related

How do I keep the split token in the second part of what was split in Ruby?

In Ruby, how do you split a string and keep the token you are splitting on in the second part of the result? I have
line.split(/(?<=#{Regexp.escape(split_token)})/)
But the token is getting merged into the first part of the split, and I want it in the second part:
2.4.0 :004 > split_token = "aaa"
=> "aaa"
2.4.0 :005 > line = "bbb aaa ccc"
=> "bbb aaa ccc"
2.4.0 :006 > line.split(/(?<=#{Regexp.escape(split_token)})/)
=> ["bbb aaa", " ccc"]
Changing lookbehind ((?<=) to lookahead ((?=) seems to do the trick:
split_token = "aaa"
line = "bbb aaa ccc"
line.split(/(?=#{Regexp.escape(split_token)})/)
# => ["bbb ", "aaa ccc"]
This just changes the split point to before the token rather than after it.
Another possibility is to use slice_before :
line.split.slice_before('aaa').map{|s| s.join(' ')}
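The two approaches are not quite equivalent, which the following sketch tries to show side by side. It reuses the question's sample data; by_lookahead and by_slice are names introduced here for illustration.

```ruby
split_token = "aaa"
line = "bbb aaa ccc"

# Lookahead: split at the zero-width position just before the token.
by_lookahead = line.split(/(?=#{Regexp.escape(split_token)})/)
# => ["bbb ", "aaa ccc"]

# slice_before: split into words first, then start a new group at the token.
by_slice = line.split.slice_before(split_token).map { |words| words.join(" ") }
# => ["bbb", "aaa ccc"]
```

The slice_before variant works on whitespace-separated words, so it only finds the token when it is a whole word, and the original spacing is not preserved.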

Ruby - How to remove space after some characters?

I need to remove white space after some characters, not all of them. I want to remove white space after these chars: I, R, P, O. How can I do it?
"I ".gsub(/(?<=[IRPO]) /, "") # => "I"
"A ".gsub(/(?<=[IRPO]) /, "") # => "A "
" P $ R 3I&".gsub(/([IRPO])\s+/,'\1')
#=> " P$ R3I&"
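The two answers above behave the same for a single space but differ when several whitespace characters follow the letter. A small sketch, using a made-up input with two spaces:

```ruby
input = "I  x" # two spaces after "I"

# Lookbehind version: removes only a space that directly follows I, R, P or O.
one_space = input.gsub(/(?<=[IRPO]) /, "")
# => "I x"

# Capture-group version: removes the whole run of whitespace after the letter.
all_space = input.gsub(/([IRPO])\s+/, '\1')
# => "Ix"
```

Note that \s also covers tabs and newlines, so the second form is broader than a literal space.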

How to use Ruby's gsub function to replace excessive '\n' on a string

I have this string:
string = "SEGUNDA A SEXTA\n05:24 \n05:48\n06:12\n06:36\n07:00\n07:24\n07:48\n\n08:12 \n08:36\n09:00\n09:24\n09:48\n10:12\n10:36\n11:00 \n11:24\n11:48\n12:12\n12:36\n13:00\n13:24\n13:48 \n14:12\n14:36\n15:00\n15:24\n15:48\n16:12\n16:36 \n17:00\n17:24\n17:48\n18:12\n18:36\n19:00\n19:48 \n20:36\n21:24\n22:26\n23:15\n00:00\n"
I'd like to replace every \n\n occurrence with a single \n and, if possible, also remove all " " (spaces) between the numbers and the newline character \n.
I'm trying to do:
string.gsub(/\n\n/, '\n')
but it is replacing \n\n by \\n
Can anyone help me?
The real reason is that a single-quoted string doesn't interpret escape sequences (like \n).
string.gsub(/\n/, '\n')
It replaces each single newline character \n with two characters, '\' and 'n'.
You can see the difference by printing the string:
[302] pry(main)> puts '\n'
\n
=> nil
[303] pry(main)> puts "\n"
=> nil
[304] pry(main)> string = '\n'
=> "\\n"
[305] pry(main)> string = "\n"
=> "\n"
I think you're looking for:
string.gsub( / *\n+/, "\n" )
This searches for zero or more spaces followed by one or more newlines, and replaces the match with a single newline.
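To see both the quoting problem and the suggested fix together, here is a minimal sketch on a shortened version of the question's string (the shortened sample and the variable names are assumptions made for illustration):

```ruby
sample = "SEGUNDA A SEXTA\n05:24 \n05:48\n\n08:12 \n"

# Single quotes keep the backslash: two characters instead of one newline.
single_len = '\n'.length # => 2
double_len = "\n".length # => 1

# Original attempt: inserts a literal backslash followed by "n".
literal = sample.gsub(/\n\n/, '\n')
# => "SEGUNDA A SEXTA\n05:24 \n05:48\\n08:12 \n"

# Suggested fix: trailing spaces plus any run of newlines become one newline.
cleaned = sample.gsub(/ *\n+/, "\n")
# => "SEGUNDA A SEXTA\n05:24\n05:48\n08:12\n"
```

The spaces inside "SEGUNDA A SEXTA" are untouched because the pattern only matches spaces that are immediately followed by a newline.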

Is there a bug in Ruby lookbehind assertions (1.9/2.0)?

Why doesn't the regex (?<=fo).* match foo (whereas (?<=f).* does)?
"foo" =~ /(?<=f).*/m => 1
"foo" =~ /(?<=fo).*/m => nil
This only seems to happen with singleline mode turned on (dot matches newline); without it, everything is OK:
"foo" =~ /(?<=f).*/ => 1
"foo" =~ /(?<=fo).*/ => 2
Tested on Ruby 1.9.3 and 2.0.0.
See it on Rubular
EDIT: Some more observations:
Adding an end-of-line anchor doesn't change anything:
"foo" =~ /(?<=fo).*$/m => nil
But together with a lazy quantifier, it "works":
"foo" =~ /(?<=fo).*?$/m => 2
EDIT: And some more observations:
.+ works as does its equivalent {1,}, but only in Ruby 1.9 (it seems that that's the only behavioral difference between the two in this scenario):
"foo" =~ /(?<=fo).+/m => 2
"foo" =~ /(?<=fo).{1,}/ => 2
In Ruby 2.0:
"foo" =~ /(?<=fo).+/m => nil
"foo" =~ /(?<=fo).{1,}/m => nil
.{0,} is busted (in both 1.9 and 2.0):
"foo" =~ /(?<=fo).{0,}/m => nil
But {n,m} works in both:
"foo" =~ /(?<=fo).{0,1}/m => 2
"foo" =~ /(?<=fo).{0,2}/m => 2
"foo" =~ /(?<=fo).{0,999}/m => 2
"foo" =~ /(?<=fo).{1,999}/m => 2
This has been officially classified as a bug and subsequently fixed, together with another problem concerning \Z anchors in multiline strings.
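For reference, this is the behavior one would expect on a Ruby version that includes the fix. The sketch assumes a reasonably recent interpreter; on the affected 1.9.3 and 2.0.0 builds the first expression returned nil, as described above.

```ruby
# With the fix, the /m flag no longer changes the result.
with_m    = ("foo" =~ /(?<=fo).*/m) # => 2
without_m = ("foo" =~ /(?<=fo).*/)  # => 2

# The matched text is whatever follows the "fo" prefix.
matched = "foo"[/(?<=fo).*/m]       # => "o"
```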

Match newline `\n` in ruby regex

I'm trying to understand why the following returns false: (** I should have put "outputs 0" **)
puts "a\nb" =~ Regexp.new(Regexp.escape("a\nb"), Regexp::MULTILINE | Regexp::EXTENDED)
Perhaps someone could explain.
I am trying to generate a Regexp from a multi-line String that will match the String.
Thanks in advance
puts will always return nil.
Your code should work fine, albeit lengthy. =~ returns the position of the match which is 0.
You could also use:
"a\nb" =~ /a\sb/m
or
"a\nb" =~ /a\nb/m
Note: The m option isn't necessary in this example but demonstrates how it would be used without Regexp.new.
Probably, puts caused this
1.9.3-194 (main):0 > puts ("a\nb" =~ Regexp.new(Regexp.escape("a\nb"), Regexp::MULTILINE | Regexp::EXTENDED) )
0
=> nil
1.9.3-194 (main):0 > "a\nb" =~ Regexp.new(Regexp.escape("a\nb"), Regexp::MULTILINE | Regexp::EXTENDED)
=> 0
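A short sketch of why the generated Regexp still matches under Regexp::EXTENDED, and why puts seemed to "return false". The sample text with an extra space is an assumption added here to show that Regexp.escape also escapes whitespace.

```ruby
text = "a\nb c"

# Regexp.escape turns the newline into the two characters \n and escapes the
# space, so extended mode (which ignores bare whitespace) still matches them.
escaped = Regexp.escape(text) # => "a\\nb\\ c"
pattern = Regexp.new(escaped, Regexp::MULTILINE | Regexp::EXTENDED)

position = (text =~ pattern)  # => 0, the index where the match starts

# puts prints the value but itself returns nil, which is what the REPL shows.
puts_result = puts(position)  # prints 0, returns nil
```

Since 0 is truthy in Ruby, the match position can be used directly in a condition.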

Resources