Here is my code in question
if (i == '*' || i == '\' || i == '+' || i == '-' )
But after much researching, I can't figure out how to simply check for the equality of i == '\'
Brownie points if you can guide me to a simpler solution. I want to parse a string and change any mathematical operators like so: + -> :+, * -> :*, etc. One idea I had was to use four gsub() calls, but that poses two problems: (1) I still have to find out how to check for equality with '\', and more importantly, (2) I feel like this is a lot of code duplication, and from what I hear that is a big stylistic 'no no' in Ruby.
'\\' is how you declare a string with a single character that is a \.
The reason you may think it isn't is because irb reports the inspect value of a string which includes quotes and escape sequences.
irb(main):010:0> '\\'
=> "\\"
However, if you use puts to view the string instead, you'll see it's correct:
irb(main):012:0> puts "\\"
\
So you want:
i == '\\'
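A quick way to convince yourself, as a minimal sketch (the variable name is only illustrative):

```ruby
backslash = '\\'            # one character: a single backslash

puts backslash.length       # 1
puts backslash              # prints the raw character: \
puts backslash.inspect      # prints the escaped form irb shows: "\\"
puts backslash == "\\"      # true: the same string written with double quotes
```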
As for refactoring, I'd recommend an array of operator strings and then checking whether that operator is in the array.
operators = ['+', '-', '*', '\\']
if operators.member? i
# valid operator
end
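Wrapped up as a small helper, that check might look like the sketch below. The constant OPERATORS and the method operator? are hypothetical names, not anything built in:

```ruby
# Hypothetical helper: the list mirrors the operators from the question.
OPERATORS = ['+', '-', '*', '\\'].freeze

def operator?(token)
  OPERATORS.include?(token)
end

puts operator?('+')    # true
puts operator?('\\')   # true
puts operator?('a')    # false
```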
But wait... isn't division typically a/b and not a\b? It's a forward slash to hint at numerator over denominator. A backslash wouldn't make much sense. In fact the backslash is used as an escape character due to its near uselessness in most other contexts.
So you should be using '/' which contains no special characters, and works like you expect.
Try
i == '\\'
\ is the escape character, which means that it doesn't normally represent itself, but rather changes what happens to the character immediately following it. For example, you can use it to place a single quote inside a single quoted string:
test = 'hello \'world\''
puts test
# ==> hello 'world'
or to add a newline in a double quoted string:
test = "hello\nworld"
puts test
# ==> hello
# world
So in order to use it by itself, you must escape it, leading to the double slash: \\.
As for a simpler if statement, how about
if %w{* \\ + -}.include?(i)
...
I want to parse a string and change any mathematical operators as such + -> :+, * -> :* ...etc
String#gsub would work:
puts 'a + b * c - d \ e'.gsub(/([-+*\\])/, ':\1')
Output
a :+ b :* c :- d :\ e
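If you'd rather not hand-escape the character class (a - between two characters in a class is read as a range), one option is to build the pattern from a list with Regexp.union. This is only a sketch; the variable names are made up for illustration:

```ruby
operators = ['+', '-', '*', '\\']
pattern   = Regexp.union(operators)   # escapes each operator for us

# The block's return value is used literally, so no backreference syntax is needed.
result = 'a + b * c - d \ e'.gsub(pattern) { |op| ":#{op}" }
puts result   # a :+ b :* c :- d :\ e
```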
I have to clean a string passed as a parameter, removing all lowercase letters and all special characters except:
+
|
^
space
=>
<=>
So I have this string passed as a parameter:
aA azee + B => C=
and I need to clean this string to get this result:
A + B => C
I do
string.gsub(/[^[:upper:][+|^ ]]/, "")
output: "A + B C"
I don't know how to select the => (and <=>) strings with a regex in Ruby.
I know that if I add => to my character class, as in string.gsub(/[^[:upper:][+|^ =>]]/, ""), the trailing = in my input string will be kept too.
You can try an alternative approach: matching everything you want to keep then joining the result.
You can use this regex to match everything you want to keep:
[A-Z\d+| ^]|<?=>
As you can see, this is just using | and [] to create a list of strings that you want to keep: uppercase, numbers, +, |, space, ^, => and <=>.
Example:
"aA azee + B => C=".scan(/[A-Z\d+| ^]|<?=>/).join()
Output:
"A  + B => C"
Note that there are 2 consecutive spaces between "A" and "+". If you don't want that you can call String#squeeze.
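Putting the scan and the squeeze together, a minimal sketch (variable names are illustrative only):

```ruby
input   = "aA azee + B => C="
kept    = input.scan(/[A-Z\d+| ^]|<?=>/).join   # keep only the allowed tokens
cleaned = kept.squeeze(' ')                      # collapse runs of spaces

p kept      # "A  + B => C"
p cleaned   # "A + B => C"
```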
See regex in use here
(<?=>)|[^[:upper:]+|^ ]
(<?=>) Captures <=> or => into capture group 1
[^[:upper:]+|^ ] Matches any character that is not an uppercase letter (same as [A-Z]) or +, |, ^ or a space
See code in use here
p "aA azee + B => C=".gsub(/(<?=>)|[^[:upper:]+|^ ]/, '\1')
Result: "A  + B => C" (with two spaces between A and +)
r = /[a-z\s[:punct:]&&[^+ |^]]/
"The cat, 'Boots', had 9+5=4 ^lIVEs^ leF|t.".gsub(r,'')
#=> "T B 9+54 ^IVE^ F|"
The regular expression reads, "Match lowercase letters, whitespace and punctuation that are not the characters '+', ' ', '|' and '^'." && within a character class is the set intersection operator. Here it intersects the set of characters that match a-z\s[:punct:] with those that match [^+ |^]. (Note that this includes whitespaces other than spaces.) For more information search for "character classes also support the && operator" in Regexp.
I have not included '=>' and '<=>' as those, unlike '+', ' ', '|' and '^', are multi-character strings and therefore require a different approach than simply removing certain characters.
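To see the && intersection on its own, here is a smaller sketch that is separate from the pattern above: it intersects a-z with "not a vowel", so only lowercase consonants match.

```ruby
consonants = /[a-z&&[^aeiou]]/   # lowercase letters that are not vowels

stripped = "hello world".gsub(consonants, '')
puts stripped   # eo o
```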
s = "#main= 'quotes'"
s.gsub "'", "\\'" # => "#main= quotes'quotes"
This seems to be wrong, I expect to get "#main= \\'quotes\\'"
when I don't use escape char, then it works as expected.
s.gsub "'", "*" # => "#main= *quotes*"
So there must be something to do with escaping.
Using ruby 1.9.2p290
I need to replace single quotes with back-slash and a quote.
Even more inconsistencies:
"\\'".length # => 2
"\\*".length # => 2
# As expected
"'".gsub("'", "\\*").length # => 2
"'a'".gsub("'", "\\*") # => "\\*a\\*" (length==5)
# WTF next:
"'".gsub("'", "\\'").length # => 0
# Doubling the content?
"'a'".gsub("'", "\\'") # => "a'a" (length==3)
What is going on here?
You're getting tripped up by the specialness of \' inside a regular expression replacement string:
\0, \1, \2, ... \9, \&, \`, \', \+
Substitutes the value matched by the nth grouped subexpression, or by the entire match, pre- or postmatch, or the highest group.
So when you say "\\'", the double \\ becomes just a single backslash and the result is \' but that means "The string to the right of the last successful match." If you want to replace single quotes with escaped single quotes, you need to escape more to get past the specialness of \':
s.gsub("'", "\\\\'")
Or avoid the toothpicks and use the block form:
s.gsub("'") { |m| '\\' + m }
You would run into similar issues if you were trying to escape backticks, a plus sign, or even a single digit.
The overall lesson here is to prefer the block form of gsub for anything but the most trivial of substitutions.
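Side by side, the two working forms look like this (a small sketch using the string from the question; the result variable names are illustrative):

```ruby
s = "#main= 'quotes'"

with_block   = s.gsub("'") { |m| '\\' + m }   # block result is inserted literally
with_escapes = s.gsub("'", "\\\\'")          # four backslashes typed in the literal

puts with_block     # #main= \'quotes\'
puts with_escapes   # #main= \'quotes\'
```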
s = "#main = 'quotes'"
s.gsub "'", "\\\\'"
Since \\ is equivalent to a single \, you have to write four backslashes to get a double backslash.
You need to escape the \ as well:
s.gsub "'", "\\\\'"
Outputs
"#main= \\'quotes\\'"
A good explanation found on an outside forum:
The key point to understand IMHO is that a backslash is special in
replacement strings. So, whenever one wants to have a literal
backslash in a replacement string one needs to escape it and hence
have [two] backslashes. Coincidentally a backslash is also special in a
string (even in a single quoted string). So you need two levels of
escaping, makes 2 * 2 = 4 backslashes on the screen for one literal
replacement backslash.
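The two levels can be checked one at a time. In this sketch (names are illustrative), the string literal halves the backslashes first, and the replacement-string handling halves them again:

```ruby
literal = "\\\\"                 # four backslashes typed, two in the actual string
puts literal.length              # 2

replaced = "a-b".gsub("-", literal)   # gsub turns the pair into one literal backslash
puts replaced                    # a\b
puts replaced.length             # 3
```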
source
Problem
In a source file, I have a large number of strings. Some with interpolation, some with special symbols and some with neither.
I am trying to work out if I can replace the single quotes with double quotes whilst converting escaped single quote characters. I would then run this conversion on one or more source code files.
Example - Code
Imagine the following code:
def myfunc(var, var2 = 'abc')
s = 'something'
puts 'a simple string'
puts 'string with an escaped quote \' in it'
x = "nasty #{interpolated}" + s + ' and single quote combo'
puts "my #{var}"
end
Example - Result
I would like to turn it into this:
def myfunc(var, var2 = "abc")
s = "something"
puts "a simple string"
puts "string with an escaped quote ' in it"
x = "nasty #{interpolated}" + s + " and single quote combo"
puts "my #{var}"
end
If anyone has any ideas I'd be very grateful!
You want the negative lookbehind operator, (?<!...):
REGEX
(?<!\\)'
DEMO
http://regex101.com/r/rN5eE6
EXPLANATION
You want to replace any single quote not preceded by a backslash.
Don't forget to do a find and replace of all \' with '
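Applied to one line of the example in Ruby itself, the two steps might look like the sketch below. It is deliberately naive: it assumes the line contains no single quotes inside double-quoted strings or comments, and that no single-quoted string contains a double quote or #{.

```ruby
line = %q(puts 'string with an escaped quote \' in it')

# Step 1: swap every single quote that is not preceded by a backslash.
swapped = line.gsub(/(?<!\\)'/, '"')
# Step 2: unescape the remaining \' sequences.
converted = swapped.gsub("\\'") { "'" }

puts converted   # puts "string with an escaped quote ' in it"
```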
THERE IS MORE
For this use case, even if it's a simple use case, a ruby parser would perform better.
As Peter Hamilton pointed out, although replacing single-quoted strings with double-quoted equivalents might seem like an easy task at first, even that cannot be done easily, if at all, with regexen, mainly thanks to the possibility of single quotes in the "wrong places", such as within double-quoted strings, %q literal string constructs, heredocs, comments...
x = 'puts "foo"'
y = %/puts 'foo'/ # TODO: Replace "x = %/puts 'foo'/" with "x = %#puts 'bar'#"
But the correct solution, in this case, is much easier than the other way around (double quoted to single quoted), and actually partially attainable:
require 'ripper'
require 'sorcerer' # gem install sorcerer if necessary
my_source = <<-source
x = 'puts "foo"'
y = "puts 'bar'"
source
sexp = Ripper::SexpBuilder.new( my_source ).parse
double_quoted_source = Sorcerer.source sexp
#=> "x = \"puts \"foo\"\"; y = \"puts 'bar'\""
The reason why I say "partially attainable" is because, as you can see for yourself,
puts double_quoted_source
#=> x = "puts "foo""; y = "puts 'bar'"
Sorcerer forgets to escape double quotes inside the formerly single-quoted string. Feel free to submit a patch
to sorcerer's author Jim Weirich that would fix the problem.
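If you only need to locate the single-quoted literals rather than regenerate the source, the standard library's Ripper.lex can do that without Sorcerer. This is a sketch of a different, smaller technique than the one above (it finds the strings, it does not rewrite them), and the variable names are illustrative:

```ruby
require 'ripper'

source = <<~RUBY
  x = 'puts "foo"'
  y = "puts 'bar'"
RUBY

# Each token is [[line, column], event, text, state].
# Keep only the tokens that open a single-quoted string literal.
single_quoted = Ripper.lex(source).select do |_pos, event, text|
  event == :on_tstring_beg && text == "'"
end

single_quoted.each do |(line, column), _event, _text|
  puts "single-quoted string starts at line #{line}, column #{column}"
end
```

The quotes inside "puts 'bar'" are part of the string's content, so the lexer does not report them as string openers.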
How can I remove from a string all characters except white spaces, numbers, and some others?
Something like this:
oneLine.gsub(/[^ULDR0-9\<\>\s]/i,'')
I need only: 0-9 l d u r < > <space>
Also, is there a good document about the use of regex in Ruby, like a list of special characters with examples?
The regex you have is already working correctly. However, you do need to assign the result back to the string you're operating on. Otherwise, you're not changing the string (.gsub() does not modify the string in-place).
You can improve the regex a bit by adding a '+' quantifier (so consecutive characters can be replaced in one go). Also, you don't need to escape angle brackets:
oneLine = oneLine.gsub(/[^ULDR0-9<>\s]+/i, '')
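The difference between gsub and its in-place sibling gsub! can be seen in a short sketch (the sample string is made up):

```ruby
one_line = "x1 u<d> zR".dup   # dup so the string is mutable

cleaned = one_line.gsub(/[^ULDR0-9<>\s]+/i, '')
puts cleaned    # 1 u<d> R
puts one_line   # x1 u<d> zR  (gsub left the original alone)

one_line.gsub!(/[^ULDR0-9<>\s]+/i, '')   # the bang version mutates the receiver
puts one_line   # 1 u<d> R
```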
A good resource with special consideration of Ruby regexes is the Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan. A good online tutorial by the same author is here.
Good old String#delete does this without a regular expression. The ^ means 'NOT'.
str = "12eldabc8urp pp"
p str.delete('^0-9ldur<> ') #=> "12ld8ur "
Just for completeness: you don't need a regular expression for this particular task, this can be done using simple string manipulation:
irb(main):005:0> "asdasd123".tr("^ULDRuldr0-9<>\t\r\n ", '')
=> "dd123"
There's also the tr! method if you want to replace the old value:
irb(main):009:0> oneLine = 'UasdL asd 123'
irb(main):010:0> oneLine.tr!("^ULDRuldr0-9<>\t\r\n ", '')
irb(main):011:0> oneLine
=> "UdL d 123"
This should be a bit faster as well (but performance shouldn't be a big concern in Ruby :)
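One thing to watch with tr and delete is how the character set is quoted. In a single-quoted Ruby string, \t is a backslash followed by the letter t, and tr then treats that backslash as escaping the t. A small sketch (the sample text is made up):

```ruby
text = "net\t12"

# Single quotes: the set contains the letter t, not a tab, so t survives and the tab is removed.
single = text.tr('^0-9\t', '')
# Double quotes: the set contains a real tab character, so the tab survives.
double = text.tr("^0-9\t", '')

puts single.inspect   # "t12"
puts double.inspect   # "\t12"
```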
I've been working my way through the Ruby Koans and am confused by the "escape clauses and single quoted strings" examples.
One example shows that I can't really use escape characters in this way, but immediately after, the following example is given:
def test_single_quotes_sometimes_interpret_escape_characters
string = '\\\''
assert_equal 2, string.size # <-- my answer is correct according to the program
assert_equal "\\'", string # <-- my answer is correct according to the program
end
This has confused me on two fronts:
Single quotes can sometimes be used with escape characters.
Why is the string size 2, when assert_equal is "\\\'"? (I personally thought the answer was "\'", which would make more sense with regards to size).
You can break your string into two pieces to clarify things:
string = '\\' + '\''
Each part is a string of length one; '\\' is the single character \ and '\'' is the single character '. When you put them together you get the two character string \'.
There are two characters that are special within a single quoted string literal: the backslash and the single quote itself. The single quote character is, of course, used to delimit the string so you need something special to get a single quote into a single quoted string, the something special is the backslash so '\'' is a single quoted string literal that represents a string containing one single quote character. Similarly, if you need to get a backslash into a single quoted string literal you escape it with another backslash so '\\' has length one and contains one backslash.
The single quote character has no special meaning within a double quoted string literal so you can say "'" without any difficulty. The backslash, however, does have a special meaning in double quoted strings so you have to say "\\" to get a single backslash into your double quoted string.
Consider your guess of "\'". The single quote has no special meaning within a double quoted string and escaping something that doesn't need escaping just gives you your something back; so, if c is a character that doesn't need to be escaped within a double quoted string, then \c will be just c. In particular, "\'" evaluates to "'" (i.e. one single quote within a double quoted string).
The result is that:
'\\\'' == "\\'"
"\\\"" == '\\"'
"\'" == '\''
"\'" == "'"
'\\\''.length == 2
"\\\"".length == 2
"\'".length == 1
"'".length == 1
The Wikibooks reference that Kassym gave covers these things.
I usually switch to %q{} (similar to single quoting) or %Q{} (similar to double quoting) when I need to get quotes into strings; all the backslashes make my eyes bleed.
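For example, a minimal sketch with made-up strings:

```ruby
single = %q{It's a "quoted" word}       # like single quotes: no interpolation
name   = 'Ruby'
double = %Q{#{name}'s "quoted" word}    # like double quotes: interpolation works

puts single   # It's a "quoted" word
puts double   # Ruby's "quoted" word
```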
This might be worth a read : http://en.wikibooks.org/wiki/Ruby_Programming/Strings
ruby-1.9.3-p0 :002 > a = '\\\''
=> "\\'"
ruby-1.9.3-p0 :003 > a.size
=> 2
ruby-1.9.3-p0 :004 > puts a
\'
In single quotes there are only two escape sequences: \\ and \'.