I have a regex in my code that is meant to match URLs, and it threw an error:
/^(http|https):\/\/([\w-]+\.)+[\w-]+([\w- .\/?%&=]*)?$/
The error was "empty range in char class". I found that the cause is the ([\w- .\/?%&=]*)? part. Ruby seems to treat the - in \w- . as a range operator instead of a literal -. After escaping the dash, the problem was solved.
But the original regular expression ran fine on my co-workers' machines. We use the same versions of OS X, Rails and Ruby: Ruby is 1.9.3p194, Rails is 3.1.6 and OS X is 10.7.5. And after we deployed the code to our Heroku server, everything worked fine too. Why did only my environment raise an error for this regex? How does Ruby interpret regular expressions?
I can replicate this error on Ruby 1.9.3p194 (2012-04-20 revision 35410) [i686-linux], installed on Ubuntu 12.04.1 LTS using rvm 1.13.4. However, this should not be a version-specific error. In fact, I'm surprised it worked on the other machines at all.
A simpler demonstration that fails just as well:
"abcd" =~ /[\w- ]/
This is because [\w- ] is interpreted as "a range beginning with any word character up to space (or blank)", rather than a character class containing a word character, a hyphen, or a space, which is what you intended.
Per Ruby's regular expression documentation:
Within a character class the hyphen (-) is a metacharacter denoting an inclusive range of characters. [abcd] is equivalent to [a-d]. A range can be followed by another range, so [abcdwxyz] is equivalent to [a-dw-z]. The order in which ranges or individual characters appear inside a character class is irrelevant.
As you saw, prepending a backslash escapes the hyphen, so it is treated as a literal character rather than a range operator, which removes the error. However, escaping the hyphen in the middle of a character class is not recommended, since it's easy to confuse the intended meaning of the hyphen in such cases. As m.buettner pointed out, always place hyphens either at the beginning or the end of a character class:
"abcd" =~ /[-\w ]/
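As a rough sketch of the three ways to get a literal hyphen, and of what a genuinely reversed range does, something like the following can be run on a current Ruby. The method names (literal_hyphen?, compile_error, valid_url?) are made up for this illustration, and the URL pattern is simply the one from the question with the hyphen moved to the front of the last character class; it is not a complete URL validator.

```ruby
# True when the given character class matches a literal hyphen.
def literal_hyphen?(regex)
  !(regex =~ '-').nil?
end

# Returns the RegexpError message for a pattern source, or nil if it compiles.
def compile_error(source)
  Regexp.new(source)
  nil
rescue RegexpError => e
  e.message
end

# The question's pattern, with the hyphen placed first in the last class.
def valid_url?(str)
  !(str =~ /^(http|https):\/\/([\w-]+\.)+[\w-]+([-\w .\/?%&=]*)?$/).nil?
end

puts literal_hyphen?(/[-\w ]/)   # hyphen first
puts literal_hyphen?(/[\w -]/)   # hyphen last
puts literal_hyphen?(/[\w\- ]/)  # hyphen escaped
puts compile_error('[z-a]')      # a reversed range fails to compile
puts valid_url?('http://example.com/a-b?x=1')
```

Building the failing pattern with Regexp.new from a string keeps the error at run time, so the rest of the script still loads.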
I need to get the basename in Ruby from a file path that I receive as input. Unix format works fine on Linux.
File.basename("/tmp/text.txt")
returns "text.txt".
However, when I get input in windows format:
File.basename("C:\Users\john\note.txt")
or
File.basename("C:\\Users\\john\\note.txt")
"C:Usersjohn\note.txt" is the output (note that \n is a new line there), but I didn't get "note.txt".
Is there some nice solution in ruby/rails?
Solution:
"C:\\test\\note.txt".split(/\\|\//).last
=> "note.txt"
"/tmp/test/note.txt".split(/\\|\//).last
=> "note.txt"
This works as long as the Linux file name doesn't itself contain a \.
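A minimal sketch of the same idea wrapped in a helper: normalize backslashes to forward slashes first, then let File.basename do the work. The name basename_any is invented for this example, and the same caveat applies: a POSIX file name that really contains a backslash would be split.

```ruby
# Hypothetical helper: treat both "\" and "/" as path separators,
# regardless of the platform Ruby is running on.
def basename_any(path)
  File.basename(path.tr('\\', '/'))
end

puts basename_any('C:\Users\john\note.txt')
puts basename_any('/tmp/test/note.txt')
```

Single quotes are used for the Windows path so that the backslashes reach the method unchanged.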
Try Pathname (this relies on the platform treating \ as a path separator, which is the case on Windows):
require 'pathname'
Pathname.new('C:\Users\john\note.txt').basename
# => #<Pathname:note.txt>
Pathname docs
Ref How to get filename without extension from file path in Ruby
I'm not convinced that you have a problem with your code. I think you have a problem with your test.
Ruby also uses the backslash character for escape sequences in strings, so when you type the String literal "C:\Users\john\note.txt", Ruby sees the first two backslashes as invalid escape sequences, and so ignores the escape character. \n refers to a newline. So, to Ruby, this literal is the same as "C:Usersjohn\note.txt". There aren't any file separators in that sequence, since \n is a newline, not a backslash followed by the letter n, so File.basename just returns it as it receives it.
If you ask for user input in either a graphical user interface (GUI) or command line interface (CLI), the user entering input needn't worry about Ruby String escape sequences; those only matter for String literals directly in the code. Try it! Type gets into IRB or Pry, and type or copy a file path, and press Enter, and see how Ruby displays it as a String literal.
On Windows, Ruby accepts paths given using both "/" (File::SEPARATOR) and "\\" (File::ALT_SEPARATOR), so you don't need to worry about conversion unless you are displaying it to the user.
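To see the escape handling described above, the three ways of writing the literal can be compared directly. This is only an illustration; path_literals is a name made up for it.

```ruby
# Three spellings of "the same" Windows path as Ruby string literals.
def path_literals
  {
    double_escaped: "C:\\Users\\john\\note.txt", # backslashes escaped
    single_quoted:  'C:\Users\john\note.txt',    # no escape processing for \U, \j, \n
    unescaped:      "C:\Users\john\note.txt"     # \U and \j lose the backslash, \n is a newline
  }
end

path_literals.each do |name, str|
  puts "#{name}: #{str.inspect}"
end
```

The first two produce the same string; the third contains a newline and no backslashes at all, so there is nothing left for File.basename to split on.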
Backslashes, while how Windows expresses things, are just a giant nuisance. Within a double-quoted string they have special meaning so you either need to do:
File.basename("C:\\Users\\john\\note.txt")
Or use single quotes that avoid the issue:
File.basename('C:\Users\john\note.txt')
Or use regular slashes which aren't impacted:
File.basename("C:/Users/john/note.txt")
In that case Ruby maps them to the platform-specific path separator for you.
[[:punct:]] doesn't match any punctuation when it's called by a rails model test. Using the following code
test "punctuation matched by [[:punct:]]" do
  punct_match = /\A[[:punct:]]+\Z/.match('[\]\[!"#$%&\'()*+,./:;<=>?#\^_`{|}~-]')
  puts punct_match
  puts punct_match.class
end
this outputs a non-printable character and NilClass.
However, if I execute the same statement
punct_match = /\A[[:punct:]]+\Z/.match('[\]\[!"#$%&\'()*+,./:;<=>?#\^_`{|}~-]')
in irb, it matches correctly and outputs
[\]\[!"#$%&'()*+,./:;<=>?#\^_`{|}~-]
=> nil
What am I missing?
Ruby version => 2.2.4,
Rails version => 4.2.6
The behaviour of /[[:punct:]]/ changed slightly in ruby version 2.4.0.
This bug was raised in the Ruby issue tracker, which links back to a (much older) issue in Onigmo, the regexp engine used since Ruby 2.0.
The short answer is, these characters were not matched by /[[:punct:]]/ in ruby versions <2.4.0, and are now matched:
$+<=>^`|~
You must be running irb in a newer ruby version than this rails application, which is why it matches there.
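One way to check which side of the change a given interpreter is on is to test those characters one by one. The sketch below assumes nothing beyond core Ruby; punct_report is a name chosen for the example.

```ruby
# The characters whose [[:punct:]] behaviour changed in Ruby 2.4.0.
def changed_punct_chars
  ['$', '+', '<', '=', '>', '^', '`', '|', '~']
end

# Maps each character to whether [[:punct:]] matches it on this interpreter.
def punct_report(chars)
  chars.map { |c| [c, !(c =~ /[[:punct:]]/).nil?] }.to_h
end

puts "Ruby #{RUBY_VERSION}"
punct_report(changed_punct_chars).each do |char, matched|
  puts "#{char.inspect}: #{matched}"
end
```

Running it under both the irb interpreter and the one used by the Rails test should show whether the two really differ.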
On a separate note, you should alter your code slightly to:
/\A[[:punct:]]+\z/.match('[]!"#$%&\'()*+,./:;<=>?#^_`{|}~-]')
Use \z, not \Z. There is a slight difference: \Z will also match a new line at the end of the string.
You have unnecessary back-slashes in the string, such as '\^'
You have repeated a [ character: '[\]\['
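The difference between \Z and \z from the first note can be seen with a short string that ends in a newline. The method names here are invented for the illustration.

```ruby
# \Z matches at the end of the string, or just before a final newline.
def matches_with_upper_z?(str)
  !(str =~ /\Aabc\Z/).nil?
end

# \z matches only at the very end of the string.
def matches_with_lower_z?(str)
  !(str =~ /\Aabc\z/).nil?
end

puts matches_with_upper_z?("abc\n")
puts matches_with_lower_z?("abc\n")
puts matches_with_lower_z?("abc")
```

For validation, \z is usually the safer choice, because a trailing newline cannot slip through.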
Aptana is returning:
Invalid escape character syntax
File.open("C:\Users\C*****\Documents\RubyProjects\text.txt
What do I do?
\ is an escape character in most languages, so the parser expects an escapable character after it. To get a literal backslash you escape it with another backslash, so you just need to use two of them:
File.open("C:\\Users\\C*****\\Documents\\RubyProjects\\text.txt
Ruby doesn't need you to use backslashes. In your string
"C:\Users\C*****\Documents\RubyProjects\text.txt"
you're confusing Ruby because you have backslashes, which denote escapes in a double-quoted (interpolated) string and make Ruby reject the literal. Instead use:
"C:/Users/C*****/Documents/RubyProjects/text.txt"
From the IO documentation:
Ruby will convert pathnames between different operating system conventions if possible. For instance, on a Windows system the filename "/gumby/ruby/test.rb" will be opened as "\gumby\ruby\test.rb". When specifying a Windows-style filename in a Ruby string, remember to escape the backslashes:
"c:\\gumby\\ruby\\test.rb"
Our examples here will use the Unix-style forward slashes; File::ALT_SEPARATOR can be used to get the platform-specific separator character.
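The likely trigger in the path from the question is the \C sequence: in a double-quoted literal Ruby reads \C- as the start of a control-character escape, so \C followed by anything else is rejected. The sketch below uses eval only so that the broken literal can be shown without breaking the script itself; literal_error is a made-up helper name.

```ruby
# Evaluates Ruby source containing a single string literal and returns
# the exception class if the literal is rejected, or nil if it parses.
def literal_error(source)
  eval(source)
  nil
rescue SyntaxError => e
  e.class
end

# Single quotes keep the backslashes intact inside the source text.
puts literal_error('"C:\Users\C*****\Documents\RubyProjects\text.txt"').inspect
puts literal_error('"C:\\\\Users\\\\C*****\\\\Documents\\\\RubyProjects\\\\text.txt"').inspect
puts literal_error('"C:/Users/C*****/Documents/RubyProjects/text.txt"').inspect

# Forward slashes need no escaping at all.
puts File.basename("C:/Users/C*****/Documents/RubyProjects/text.txt")
```

In real code there is no reason to use eval; writing the path with forward slashes or doubled backslashes is enough.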
In Ruby 1.8.7 the following regex produces the warning: nested repeat operator + and * was replaced with '*'.
^(\w+\.\w+)\|(\w+\.\w+)\n+*$
It does work in Ruby 2.0 though?
http://rubular.com/r/nRUSP5LNZA
A nested operator works, but it triggers a warning because it is useless. \n+* means:
Zero or more repetitions of
One or more repetitions of
\n
which is equivalent to the simpler expression \n*, which means:
Zero or more repetitions of
\n
There is no reason to use \n+*. Ruby's regex engine was replaced in Ruby 1.9 and again in Ruby 2.0, and if there are any differences, then it is simply that the newer engine does not check for warnings as the older one did.
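The equivalence can be spot-checked by comparing the two forms on a few inputs. This is a sketch on a simplified pattern, not the one from the question; same_matches? is a name made up for it, and the patterns are built with Regexp.new from strings so any warning is raised at run time rather than when the file is parsed.

```ruby
# Compares a nested quantifier (\n+*) with its simplified form (\n*)
# and reports whether they agree on every sample string.
def same_matches?(samples)
  nested = Regexp.new('\Aab\n+*\z')
  simple = Regexp.new('\Aab\n*\z')
  samples.all? do |s|
    nested_hit = !(s =~ nested).nil?
    simple_hit = !(s =~ simple).nil?
    nested_hit == simple_hit
  end
end

puts same_matches?(["ab", "ab\n", "ab\n\n", "abc", ""])
```

A handful of samples is of course not a proof, but it matches the reasoning above: zero or more repetitions of "one or more newlines" is just zero or more newlines.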
While the build of 1.8.7 I have seems to have a backported version of Shellwords::shellescape, I know that method is a 1.9 feature and definitely isn't supported in earlier versions of 1.8. Does anyone know where I can find, either in Gem form or just as a snippet, a robust standalone implementation of Bourne-shell command escaping for Ruby?
You might as well just copy what you want from shellwords.rb in the trunk of Ruby's subversion repository (which is GPLv2'd):
def shellescape(str)
  # An empty argument will be skipped, so return empty quotes.
  return "''" if str.empty?
  str = str.dup
  # Process as a single byte sequence because not all shell
  # implementations are multibyte aware.
  str.gsub!(/([^A-Za-z0-9_\-.,:\/@\n])/n, "\\\\\\1")
  # A LF cannot be escaped with a backslash because a backslash + LF
  # combo is regarded as line continuation and simply ignored.
  str.gsub!(/\n/, "'\n'")
  return str
end
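On Ruby 1.9 and later the same behaviour is available from the standard library as Shellwords.escape, so the copied method is only needed on older interpreters. A short usage sketch, where shell_roundtrip is a helper name invented for the example:

```ruby
require 'shellwords'

# Escapes an argument for a Bourne shell and splits it back the way a
# shell would, to confirm that it survives as a single word.
def shell_roundtrip(arg)
  Shellwords.split(Shellwords.escape(arg))
end

puts Shellwords.escape("it's a file.txt")
puts Shellwords.escape('')
puts shell_roundtrip("it's a file.txt").inspect
```

The round trip through Shellwords.split is only a sanity check inside Ruby; it does not invoke an actual shell.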
I wound up going with the Escape gem, which has the additional feature of using quotes by default, and only backslash-escaping when necessary.