Are there ways to modify Rstudio's Console behavior when adding missing quotes - rstudio

I've searched both SO and Rstudio's community pages and failed to find a question, much less an answer, about this annoyance I have experienced with Rstudio. (The Rstudio help pages won't let me post a second question within 12 hours of my first, which was explained as a bug.)
If I type:
(test)
... and then realize that test should be quoted, then putting the cursor at the end of test and entering a double-quote will give me two double-quotes "". It will not do this if I first enter a quote between ( and t, and then it will also not give me doubling of the double-quote character at the end of test. Why should it matter whether I first correct my error at the end of the symbol or at the beginning? Is there anything I can do to modify this quirk? It seems that a syntax-aware console editor ought to be able to tell when a doubling of quotes does not make sense. It's obviously making that "decision" when the quotes are entered between an open-paren and a character. Why not suppress the unhelpful behavior when it is between a character and a close-paren?

Related

asciidoc: Is there a way to get around the problem of lines beginning with [colour] attributes ending with ] not displaying in asciidoc?

My target asciidoc text is this:
[red]#Some prompt[x]# Make sure the option is [checked]
But it won't display in asciidoc
On further investigation, I found that any line beginning with a [colour] in square brackets, and ending in a right-bracket is similarly not displayed.
Now, in this case, I've got around the problem by putting the whole prompt section in bold, like this:
*[red]#Some prompt[x]#* Make sure the option is [checked]
but this is not ideal. Adding a period after the final close bracket \] also AVOIDS the problem - but in my use case I didn't like it.
I'd like to know if there is a better way. So far I've tried:
Escaping the leading open bracket \[
Escaping the final close bracket \]
Removing the [x] in the middle, thinking the additional brackets in the middle may influence the outcome
but none of these has worked.
So my question is:
Is there a way to get around the problem of lines beginning with [colour] attributes ending with ] not displaying in asciidoc?
It seems to me that a line which begins with an opening bracket and ends with a closing bracket is being interpreted as a block attribute line.
There are a number of ways you can mitigate this.
Use a character replacement attribute. There are many built-in attributes, or you can easily define your own.
For example:
[.red]#Some prompt[x]# Make sure the option is [checked{endsb}
Use one of the inline pass-through syntaxes, for example ++:
[.red]#Some prompt[x]# Make sure the option is [checked++]++
Prevent the first opening bracket from being the first character of the line. This also uses a built-in attribute, and the markup needs to be changed to the unconstrained form.
For example:
{empty}[.red]##Some prompt[x]## Make sure the option is [checked]

Why is there a starting backtick in bash's unexpected EOF `"' error

Given:
echo '"Number' > temp.sh
./temp.sh
With this script, Bash prints this error message:
./temp.sh: line 1: unexpected EOF while looking for matching `"'
Why does it print out `"' versus something like '"'?
PS: I tried searching for an answer but only got answers to questions asking for help debugging this error. Instead I want to know why the error message prints out a starting backtick versus a single quote.
Using separate open and close quotes is historically considered good form in English, and an essential part of proper typography. This fell partially out of style due to cost-saving measures (and attempts to conserve the limited 7-bit ASCII character space), but has never completely disappeared.
From Practical Typography:
Curly quotes are the quotation marks used in good typography.
From Wikipedia:
"Ambidextrous" quotation marks were introduced on typewriters to reduce the number of keys on the keyboard, and were inherited by computer keyboards and character sets. Some computer systems designed in the past had character sets with proper opening and closing quotes. However, the ASCII character set, which has been used on a wide variety of computers since the 1960s, only contains a straight single quote (U+0027 ' apostrophe) and double quote (U+0022 " quotation mark).
...and, a few paragraphs below, referring specifically to the (mis)use of the backtick as an open quote:
These same systems often drew the grave accent (`, U+0060) as an open quote glyph (actually a high-reversed-9 glyph, to preserve some usability as a grave). This gives a proper appearance at the cost of semantic correctness. Nothing similar was available for the double quote, so many people resorted to using two single quotes for double quotes, which would look like the following: [...]

How to get unstuck in Ruby irb? [duplicate]

Possible Duplicate:
Is there a way to get out of a “hung” state in IRB?
I am using IRB. When I am coding I noticed that I get "stuck" when the line ends with "/":
irb(main):057:0/
When that happens I cannot do anything, I can't exit, define things, etc. It keeps looping back to lines that end in "/".
But, when a line ends like this:
irb(main):056:0>
everything seems to work fine. I can exit if needed, define anything, etc.
How can I get unstuck when a line ends in "/"?
You can press ctrl + c followed by return to get back IRB's prompt.
When you are 'stuck' in IRB it is usually because of an unclosed delimiter, such as an opening quote that has no matching closing quote.
For this specific question it is due to being in a Regexp object, of which the delimiter is a '/' that you have pressed. This is identical in action to having an open quote that you don't close. As soon as you enter a forward slash to close the Regexp, you will find yourself on the next prompt, and you will see some return from IRB on the line prior to your cursor location. So, it is appropriate to simply close the delimiter.
Perhaps, and this can only be a fishing expedition, you meant to ignore the return at the end of the line instead, and when you should have used the backslash ('\') you used the forward slash?
Control-C is kind of heavy handed, as it attempts to send an interrupt. Control-D is the EOD or End of Data* character, and so will generally let IRB know that you are done inputting data on the line (or stream).
This works for more than simply IRB and can get you out of some pretty tough places without terminating the application that is running, allowing you to exit gracefully or even to correct your mistake and continue running the program, as sometimes happens in IRB.
Of course, if that fails, then try Control-C; it will likely be just heavy-handed enough to get you through.
*: historically, EOT or "End of Tape" or "End of Transmission". It may be simply my mnemonic to relate it to 'data' as in a stream of input.
That's caused by hitting Enter when the line ends in an unterminated regex. You can either use Ctrl+C like alex said, or complete the regexp by ending it with another slash (or, if you began the regexp with %r{, end it with a }, etc.)
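To illustrate why IRB keeps waiting: a regexp literal in Ruby is allowed to span several lines, so an opening slash without its partner is simply unfinished input. A minimal sketch (the variable name pattern is only for illustration):

```ruby
# A regexp literal may legally continue across lines, so after the first
# line the parser (and therefore IRB) is still waiting for the closing slash.
pattern = /foo
bar/

# The line break became part of the pattern.
p pattern.source                 # => "foo\nbar"
p pattern.match?("foo\nbar")     # => true
```

Typing the second line, including the closing slash, is what brings the normal > prompt back.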

Matching an unescaped balanced pair of delimiters

How can I match a balanced pair of delimiters not escaped by backslash (that is itself not escaped by a backslash) (without the need to consider nesting)? For example with backticks, I tried this, but the escaped backtick is not working as escaped.
regex = /(?!<\\)`(.*?)(?!<\\)`/
"hello `how\` are` you"
# => $1: "how\\"
# expected "how\\` are"
And the regex above does not consider a backslash that is escaped by a backslash and is in front of a backtick, but I would like to.
How does StackOverflow do this?
The purpose of this is not very complicated. I have documentation texts, which include the backtick notation for inline code just like StackOverflow, and I want to display that in an HTML file with the inline code decorated with some span material. There would be no nesting, but escaped backticks or escaped backslashes may appear anywhere.
Lookbehind is the first thing everyone thinks of for this kind of problem, but it's the wrong tool, even in flavors like .NET that support unrestricted lookbehinds. You can hack something up, but it's going to be ugly, even in .NET. Here's a better way:
`[^`\\]*(\\.[^`\\]*)*`
The first part starts from the opening delimiter and gobbles up anything that's not the delimiter or a backslash. If the next character is a backslash, it consumes that and the character following it, whatever it may be. It could be the delimiter character, another backslash, or anything else, it doesn't matter.
It repeats those steps as many times as necessary, and when neither [^`\\] nor \\. can match, the next character must be the closing delimiter. Or the end of the string, but I'm assuming the input is well formed. But if it's not well formed, this regex will fail very quickly. I mention that because of this other approach I see a lot:
`(?:[^`\\]+|\\.)*`
This works fine on well-formed input, but what happens if you remove the last backtick from your sample input?
"hello `how\` are you"
According to RegexBuddy, after encountering the first backtick, this regex performed 9,252 distinct operations (or steps) before it could give up and report failure; mine failed in ten steps.
EDIT To extract just the part inside the delimiters, wrap that part in a capturing group. You'll still have to remove the backslashes manually.
`([^`\\]*(?:\\.[^`\\]*)*)`
I also changed the other group to non-capturing, which I should have done from the start. I don't avoid capturing religiously, but if you are using groups to capture stuff, any other groups you use should be non-capturing.
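As a rough illustration of the capture-group version in Ruby (the language of the question), here is a minimal sketch. The constant INLINE_CODE and the helper inline_code_spans are names made up for this example, and the unescaping step is just one simple way to remove the backslashes by hand:

```ruby
# Opening backtick, then runs of ordinary characters interleaved with
# backslash-escaped characters, then the closing backtick.
INLINE_CODE = /`([^`\\]*(?:\\.[^`\\]*)*)`/

# Hypothetical helper: returns the content of every inline-code span,
# with one level of backslash escaping removed.
def inline_code_spans(text)
  text.scan(INLINE_CODE).map do |(raw)|
    raw.gsub(/\\(.)/) { $1 }   # "\`" becomes "`", "\\" becomes "\"
  end
end

sample = 'hello `how\` are` you'
p inline_code_spans(sample)    # => ["how` are"]
```

Because an escaped character is always consumed as a pair, a backslash that is itself escaped no longer protects the backtick after it, which is the behaviour asked for in the question.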
EDIT I think I've been reading too much into the question. On StackOverflow, if you want to include literal backticks in an inline-code segment or a comment, you use three backticks as the delimiter, not just one. Since there's no need to escape backticks, you can ignore backslashes as well. Your regex could turn out to be as simple as this:
```(.*?)```
Dealing with the possibility of false delimiters, you use the same basic technique:
```([^`]*(?:`(?!``)[^`]*)*)```
Is this what you're after?
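A small Ruby sketch of the triple-backtick version, assuming the input really does use three backticks as the delimiter (the constant name TRIPLE is made up for the example):

```ruby
# Three backticks, then anything that is not the start of another
# three-backtick run, then three backticks.
TRIPLE = /```([^`]*(?:`(?!``)[^`]*)*)```/

single = 'use ```a ` b``` here'
double = 'use ```a `` b``` here'

p single[TRIPLE, 1]   # => "a ` b"
p double[TRIPLE, 1]   # => "a `` b"
```

Single and double backticks inside the span are kept as content; only a run of three ends it.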
By the way, this answer doesn't contradict @nneonneo's comment above. This answer doesn't consider the context in which the match is taking place. Is it in the source code of a program or web page? If it is, did the match occur inside a comment or a string literal? How do I even know the first backtick I found wasn't escaped? Regexes don't know anything about the context in which they operate; that's what parsers are for.
If you don't need nesting, regexes can indeed be a proper tool. Lexers of programming languages, for instance, use regexes to tokenize strings, and strings usually allow their own delimiters as an escaped content. Anything more complicated than that will probably need a full-blown parser though.
The "general formula" is to match an escaped character (\\.) or any character that's valid as content but doesn't need to be escaped ([^{list of invalid chars}]). A "naïve" solution would be joining them with or (|), but for a more efficient variant see @AlanMoore's answer.
The complete example is shown below, in two variants: the first assumes that backslashes should only be used for escaping inside the string, the second assumes that a backslash anywhere in the text escapes the next character.
`((?:\\.|[^`\\])*)`
(?:\\.|[^`\\])*`((?:\\.|[^`\\])*)`
Working examples here and here. However, as @nneonneo commented (and I endorsed), regexes are not meant to do a complete parse, so you'd better keep things simple if you want them to work out right (do you want to find a token in the text, or do you want to delimit it already knowing where it starts? The answer to that question is important to decide which strategy works best for your case).
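To make the difference between the two variants concrete, here is a minimal Ruby sketch. The constant names INSIDE_ONLY and ANYWHERE are made up for this example, and the sample text deliberately has an escaped backtick outside the code span:

```ruby
# Variant 1: backslashes are only treated as escapes inside the span.
INSIDE_ONLY = /`((?:\\.|[^`\\])*)`/

# Variant 2: a backslash anywhere in the text escapes the next character,
# so the text before the opening backtick is consumed pair by pair too.
ANYWHERE = /(?:\\.|[^`\\])*`((?:\\.|[^`\\])*)`/

text = 'a \` b `code` c'

p text[INSIDE_ONLY, 1]   # => " b "   (starts at the escaped backtick)
p text[ANYWHERE, 1]      # => "code"  (skips over the escaped backtick)
```

The first variant has no way of knowing that the backtick it starts from was escaped, which is the context problem mentioned above.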

How to escape unicode characters in bash prompt correctly

I have a specific method for my bash prompt, let's say it looks like this:
CHAR="༇ "
my_function="
prompt=\" \[\$CHAR\]\"
echo -e \$prompt"
PS1="\$(${my_function}) \$ "
To explain the above, I'm building my bash prompt by executing a function stored in a string, which was a decision made as the result of this question. Let's pretend like it works fine, because it does, except when unicode characters get involved.
I am trying to find the proper way to escape a unicode character, because right now it messes with the bash line length. An easy way to test if it's broken is to type a long command, execute it, press CTRL-R and type to find it, and then pressing CTRL-A CTRL-E to jump to the beginning / end of the line. If the text gets garbled then it's not working.
I have tried several things to properly escape the unicode character in the function string, but nothing seems to be working.
Special characters like this work:
COLOR_BLUE=$(tput sgr0 && tput setaf 6)
my_function="
prompt=\"\\[\$COLOR_BLUE\\] \"
echo -e \$prompt"
Which is the main reason I made the prompt a function string. That escape sequence does NOT mess with the line length, it's just the unicode character.
The \[...\] sequence says to ignore this part of the string completely, which is useful when your prompt contains a zero-length sequence, such as a control sequence which changes the text color or the title bar, say. But in this case, you are printing a character, so the length of it is not zero. Perhaps you could work around this by, say, using a no-op escape sequence to fool Bash into calculating the correct line length, but it sounds like that way lies madness.
The correct solution would be for the line length calculations in Bash to correctly grok UTF-8 (or whichever Unicode encoding it is that you are using). Uhm, have you tried without the \[...\] sequence?
Edit: The following implements the solution I propose in the comments below. The cursor position is saved, then two spaces are printed, outside of \[...\], then the cursor position is restored, and the Unicode character is printed on top of the two spaces. This assumes a fixed font width, with double width for the Unicode character.
PS1='\['"`tput sc`"'\] \['"`tput rc`"'༇ \] \$ '
At least in the OSX Terminal, Bash 3.2.17(1)-release, this passes cursory [sic] testing.
In the interest of transparency and legibility, I have ignored the requirement to have the prompt's functionality inside a function, and the color coding; this just changes the prompt to the character, space, dollar prompt, space. Adapt to suit your somewhat more complex needs.
@tripleee wins it, posting the final solution here because it's a pain to post code in comments:
CHAR="༇"
my_function="
prompt=\" \\[`tput sc`\\] \\[`tput rc`\\]\\[\$CHAR\\] \"
echo -e \$prompt"
PS1="\$(${my_function}) \$ "
The trick as pointed out in @tripleee's link is the use of the commands tput sc and tput rc which save and then restore the cursor position. The code is effectively saving the cursor position, printing two spaces for width, restoring the cursor position to before the spaces, then printing the special character so that the width of the line is from the two spaces, not the character.
(Not the answer to your problem, but some pointers and general experience related to your issue.)
I see the behaviour you describe about cmd-line editing (Ctrl-R, ... Ctrl-A Ctrl-E ...) all the time, even without unicode chars.
At one work-site, I spent the time to figure out the diff between the terminal's interpretation of the TERM setting VS the TERM definition used by the OS (well, stty I suppose).
NOW, when I have this problem, I escape out of my current attempt to edit the line, bring the line up again, and then immediately go to the 'vi' mode, which opens the vi editor. (press just the 'v' char, right?). All the ease of use of a full-fledged session of vi; why go with less ;-)?
Looking again at your problem description, when you say
my_function="
prompt=\" \[\$CHAR\]\"
echo -e \$prompt"
That is just a string definition, right? And I'm assuming you're simplifying the problem definition by assuming this is the output of your my_function. It seems very likely that the steps of creating the function definition, calling the function AND using the values returned offer a lot of opportunities for shell-quoting to not work the way you want it to.
If you edit your question to include the my_function definition, and its complete use (reducing your function to just what is causing the problem), it may be easier for others to help with this too. Finally, do you use set -vx regularly? It can help show the how/when/what of variable expansions; you may find something there.
Failing all of those, look at O'Reilly's termcap & terminfo. You may need to look at the man page for your local system's stty and related cmds AND you may do well to look for user groups specific to your Linux system (I'm assuming you use a Linux variant).
I hope this helps.

Resources