EBNF ESCAPE CHARACTERS - ebnf

I'm trying to make the grammar expressions for strings for a pseudo-language based on python, and I'm wondering how I can do the following:
The string starts and ends with either " or '. It can include any character except \ " ' and a newline. Those characters can only be included when preceded by a backslash, for example:
'Mark said, \"Boo!\"\n'(accepted)
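Before writing the EBNF, it can help to pin the rule down as a Ruby regular expression and try it on a few inputs. This is only a sketch under assumptions: that the excluded characters are the backslash, both quote characters and the newline, that an escape is a backslash followed by one of \ " ' n, and that the closing quote must match the opening one. ESCAPE, PLAIN, STRING_LITERAL and valid_string_literal? are made-up names.

```ruby
# Assumed escape set: a backslash followed by ", ', \ or n.
ESCAPE = /\\["'\\n]/
# Assumed plain characters: anything except a quote, a backslash, or a newline.
PLAIN = /[^"'\\\n]/
# Opening quote, any mix of escapes and plain characters, then the same quote.
STRING_LITERAL = /\A(["'])(?:#{ESCAPE}|#{PLAIN})*\1\z/

# Hypothetical helper: true when the source text is a well-formed string.
def valid_string_literal?(source)
  STRING_LITERAL.match?(source)
end

# %q keeps the backslashes literal, so these are the raw source characters.
puts valid_string_literal?(%q('Mark said, \"Boo!\"\n'))  # true
puts valid_string_literal?(%q('Mark said, "Boo!"'))      # false
```

The same three pieces (quote, escape, plain character) would map onto three EBNF rules.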

Related

ruby gsub new line characters

I have a string with newline characters that I want to gsub out for white space.
"hello I\r\nam a test\r\n\r\nstring".gsub(/[\\r\\n]/, ' ')
Something like this ^, only my regex seems to be replacing the 'r' and 'n' letters as well. The other constraint is that sometimes the pattern repeats itself twice and thus would be replaced with two whitespaces in a row. Although this is not preferable, it is better than all the text being cut apart.
Is there a way to select only the newline characters? Or, even better, is there a more Ruby-esque way of approaching this without resorting to regex?
If you have mixed consecutive line breaks that you want to replace with a single space, you may use the following regex solution:
s.gsub(/\R+/, ' ')
The \R matches any type of line break and + matches one or more occurrences of the quantified subpattern.
Note that if you have to deal with an older version of Ruby, you will need to use the character class [\r\n], which matches either \r or \n:
.gsub(/[\r\n]+/, ' ')
or, to cover all possible line breaks:
.gsub(/(?:\u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029])+/, ' ')
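As a quick check, here is a sketch that runs the \R+ and [\r\n]+ patterns on the string from the question. The helper names are made up for the illustration.

```ruby
# Hypothetical helper: collapse any run of line breaks into one space.
def collapse_line_breaks(text)
  text.gsub(/\R+/, ' ')
end

# Fallback for Rubies without \R: treat any run of CR/LF characters as one break.
def collapse_crlf(text)
  text.gsub(/[\r\n]+/, ' ')
end

sample = "hello I\r\nam a test\r\n\r\nstring"
puts collapse_line_breaks(sample)  # hello I am a test string
puts collapse_crlf(sample)         # hello I am a test string
```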
This should work for your test case:
"hello I\r\nam a test\r\n\r\nstring".gsub(/[\r\n]/, ' ')
If you don't want successive \r\n characters to result in duplicate spaces you can use this instead:
"hello I\r\nam a test\r\n\r\nstring".gsub(/[\r\n]+/, ' ')
(Note the addition of the + after the character class.)
As Wiktor mentioned, you're using \\ in your regex. Inside the regex literal /.../ that escapes a backslash, so your character class matches a literal backslash \, an r, or an n. Escaping works differently in regex literals than in regular strings (which are a whole different animal): \r and \n already stand for the control characters, so no extra backslash is needed.
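To make the difference visible, here is a small sketch comparing the two character classes on the string from the question. The helper names are hypothetical.

```ruby
# /[\\r\\n]/ is a class of three characters: a backslash, "r", and "n".
def replace_backslash_r_n(text)
  text.gsub(/[\\r\\n]/, ' ')
end

# /[\r\n]/ is a class of two characters: carriage return and line feed.
def replace_line_breaks(text)
  text.gsub(/[\r\n]/, ' ')
end

sample = "hello I\r\nam a test\r\n\r\nstring"
p replace_backslash_r_n(sample)  # "hello I\r\nam a test\r\n\r\nst i g"
p replace_line_breaks(sample)    # "hello I  am a test    string"
```

The first call leaves the line breaks alone and eats the letters in "string", which is the symptom described in the question.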

Regex - How can I remove specific characters between strings/delimiters?

This is related to cleaning files before parsing them elsewhere, namely, malformed/ugly CSV. I see plenty of examples for removing/matching all characters between certain strings/characters/delimiters, but I cannot find any for specific strings. Example portion of line would look something like:
","Should now be allowed by rule above "Server - Access" added by Rich"\r
To be clear, this is not the entire line, but the entire line is enclosed in quotes, separated by ",", and ends in ^M (Windows newline/carriage return). The 'columns' preceding this would be enclosed on each side by ",". I would probably use this to remove cruft that appears earlier in the line, too.
What I am trying to get to is the removal of all double quotes between "," and "\r ("Server - Access" - these ones) without removing the delimiters. Alternatively, I may just find and replace them with \" to delimit them for the Ruby CSV library. So far I have this:
(?<=",").*?(?="\\r)
Which basically matches everything between the delimiters. If I replace .*? with anything, be that a letter, double quotes etc, I get zero matches. What am I doing wrong?
Note: This should be Ruby compatible please.
If I understand you correctly, you can use negative lookahead and lookbehind:
text = '","Should now be allowed by rule above "Server - Access" added by Rich"\r'
puts text.gsub(/(?<!,)"(?![,\\r])/, '\"')
# ","Should now be allowed by rule above \"Server - Access\" added by Rich"\r
Of course, this won't work if the values themselves can contain commas and new lines...
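Here is a sketch of that caveat, using a made-up helper name and made-up sample lines. Note also that inside the regex literal the class [,\\r] means comma, backslash, or the letter r, so a quote directly followed by a lowercase r would be skipped as well.

```ruby
# Hypothetical helper wrapping the lookaround substitution shown above.
def escape_inner_quotes(line)
  line.gsub(/(?<!,)"(?![,\\r])/, '\"')
end

# Single-quoted, so \r here is a literal backslash and r, as in the answer.
safe  = '","added by "Rich" today"\r'
risky = '","He said "hi", then left"\r'

puts escape_inner_quotes(safe)   # ","added by \"Rich\" today"\r
puts escape_inner_quotes(risky)  # ","He said \"hi", then left"\r
```

In the second line the quote after hi is followed by a comma, so it is taken for a delimiter and left unescaped.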

How come you can't gsub this string in Ruby?

These \\n are showing up in my strings even though it should only be \n.
But if I do this :
"\n".gsub('\\n','\b')
It returns :
"\n"
Ideally, I'm trying to find a regex that could rewrite this string :
"R3pQvDqmz/EQ7zho2mhIeE6UB4dLa6GUH7173VEMdGCcdsRm5pernkqCgbnj\\nZjTX\\n"
To not display two backslashes, but just one like this :
"R3pQvDqmz/EQ7zho2mhIeE6UB4dLa6GUH7173VEMdGCcdsRm5pernkqCgbnj\nZjTX\n"
But none of the regexes I try work. I can gsub out the \n and put something like X there, but if I put a \ in it, then Ruby escapes it with an additional \, which consequently breaks my encryption module, as it needs the exact string.
Any ideas?
You are falling into the trap of a different meaning of escapes when used in strings with double quotes vs single quotes. Double-quoted strings allow escape characters to be used. Thus, here "\n" actually is a one-character string containing a single line feed. Compare that to '\n' which is a two-character string containing a literal backslash followed by a character n.
This explains why your gsub doesn't match. If you use the following code, it should work:
"\\n".gsub('\n','\b')
For your actual issue, you can use this:
string = "R3pQvDqmz/EQ7zho2mhIeE6UB4dLa6GUH7173VEMdGCcdsRm5pernkqCgbnj\\nZjTX\\n"
new_string = string.gsub("\\n", "\n")
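A short sketch of the quoting difference and of the substitution, using a shortened sample string and a hypothetical helper name:

```ruby
# Double quotes interpret escapes; single quotes keep the backslash.
puts "\n".length   # 1 (a single line feed)
puts '\n'.length   # 2 (a backslash and the letter n)

# Hypothetical helper: turn each backslash-n pair into a real line feed.
def unescape_newlines(text)
  text.gsub("\\n", "\n")
end

encoded = "ZjTX\\n"
p encoded                     # "ZjTX\\n"
p unescape_newlines(encoded)  # "ZjTX\n"
```

Since p prints the inspected form, the doubled backslash in its output is display only. The first string holds one backslash, the second holds none.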

Ruby string with quotes for shell command args?

Hi, I need to create a string like this:
drawtext="fontfile=/Users/stpn/Documents/Video_Experiments/fonts/Trebuchet_MS.ttf:text='content':fontsize=100:fontcolor=red:y=h/2"
I want to do something like
str = %Q[drawtext="fontfile=/Users/stpn/Documents/Video_Experiments/fonts/Trebuchet_MS.ttf:text='content':fontsize=100:fontcolor=red:y=h/2"]
I am getting this:
=> "drawtext=\"fontfile=/Users/stpn/Documents/Video_Experiments/fonts/Trebuchet_MS.ttf:text='content':fontsize=100:fontcolor=red:y=h/2\""
The escape characters, such as the one after the equals sign in drawtext=", are what I want to get rid of. How can I achieve that?
The string is to be used in a command line args.
Like many languages, Ruby needs a way of telling a quote that is part of the string apart from the quotes that enclose it.
What you're seeing is the escape character, which marks a literal quote rather than a syntactic one:
foo = 'test="test"'
# => "test=\"test\""
The escape character is only there because double-quotes are used by default when inspecting a string. It's stored internally as a single character, of course. You may also see these in other circumstances such as a CR+LF delimited file line:
"example_line\r\n"
The \r and \n here correspond to the carriage-return and line-feed characters. There are several of these characters defined in ANSI C that have carried over into many languages, including Ruby and JavaScript.
When you output a string those escape characters are not displayed:
puts foo
test="test"
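A minimal sketch that puts the two views side by side, reusing foo from above:

```ruby
foo = 'test="test"'

puts foo.length  # 11, no backslashes are stored
p foo            # "test=\"test\"" (the inspected form, as irb shows it)
puts foo         # test="test"     (the actual contents)
```

So the string can be handed to the command as it is. The backslashes only appear when Ruby displays the value with inspect.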

Substitute ' with \' in double-quoted string

The task is simple - I have a string like "I don't know" and I want to substitute ' with \' (I know that I don't have to escape single quotes). How can I do it?
Try using the block form, it should work in all versions of Ruby:
s.gsub(/'/) {"\\'"}
# => "I don\\'t know"
[Edit]
The reason is that the gsub method has special handling for backslash sequences in the replacement string which correspond to the special match variables. So you can use $' (and $1, etc.) directly in the substituted string by using the form \\' (and \\1, etc.) instead.
The block form of gsub doesn't have this behavior, so that's the workaround when you're trying to sub in a string that looks like a special backslash sequence.
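To see the special handling in action, here is a sketch that compares the plain replacement string with the block form. The helper name is hypothetical.

```ruby
s = "I don't know"

# As a replacement string, backslash-quote means "the text after the match".
p s.gsub("'", "\\'")          # "I dont knowt know"

# Hypothetical helper using the block form, whose result is inserted literally.
def escape_single_quotes(text)
  text.gsub("'") { "\\'" }
end

p escape_single_quotes(s)     # "I don\\'t know"
puts escape_single_quotes(s)  # I don\'t know
```

The doubled backslash in the p output is again just the inspected form. The string itself contains a single backslash before the quote.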
