In my experience, it's common to see spaces put inside braces for one-line definitions, e.g. this function in JavaScript:
function(a, b) { return a * b; }
Is there any technical/historical reason that most programmers seem to do this, particularly given that spaces are not included inside parentheses?
Besides readability, in some languages, such as Verilog, identifiers can be escaped (by a \ at their beginning) so that they use special characters in their names. For example, the following names are legal identifier names in Verilog:
q
\q~ //escaped version which uses ~ in the name
\element[32] //a single variable (not part of an array) whose name is \element[32]
Such identifiers, should always terminate by space, otherwise the character after them would be considered as the identifier's name:
{ d, \q~ } // Concatenating d and \q~ in a vector
{ d, \q~} // Concatenating d and \q~} in a vector. Will generate a missing brace error.
Spaces are mostly used for readability. Most of the coding styles will tell you to judiciously use whitespace in your code to make it more readable.
As an example, look at this statement: *a = *b + *c;. Think how it will seem without the whitespace.
Related
I need to create a string from a full POSIX path (starting at the root), so that it could be pasted directly into a Unix shell like bash, e.g. in Terminal.app, without the need for quotes around the path.
(I do not actually pass the string to a shell, but instead need it for passing it to another program. That program expects the path in just the form that you get when you drag a file into Terminal.app.)
For that, I need to escape at least any spaces in the string, by prepending them with a backslash. And some more characters as well.
For example, this path:
/directory/-as"<>' *+
Would be escaped as follows:
/directory/-as\"\<\>\'\ \*+
What's a safe algorithm to perform that conversion? I could escape every character, but that would be overkill.
There seems to be no framework function for doing this, so I'll need to do the replacing with string operations.
To be conservative (for the most popular shells), while also avoiding clearly unnecessary escapings, what set of characters should be escaped?
For the record, Terminal.app escapes the following non-control ASCII chars when dropping a file name into its window:
Space
!"#$%&'()*,:;<=>?[]`{|}~
And these are not escaped:
Control codes (00-1F and 7F)
Alphanumerical
+-.#^_
And here's the code that would perform the replacement:
NSString* shellPathFromPOSIXPath (NSString *path)
{
static NSRegularExpression *regex = nil;
if (!regex) {
NSString *pattern =
#"([ !\\\"\\#\\$\\%\\&\\'\\(\\)\\*\\,\\:\\;\\<\\=\\>\\?\\[\\]\\`\\{\\|\\}\\~])";
regex =
[NSRegularExpression regularExpressionWithPattern:pattern options:0 error:nil];
}
NSString *result =
[regex stringByReplacingMatchesInString:path
options:0
range:NSMakeRange(0, path.length)
withTemplate:#"\\\\$1"];
return result;
}
Better to put the whole thing in single quotes, rather than adding backslashes to individual characters; then the only character you need to escape is a single-quote present inside the string.
The Python standard library's implementation, provided as an example which can be easily reimplemented in any other language having only basic primitives, reads as follows:
def quote(s):
"""Return a shell-escaped version of the string *s*."""
if not s:
return "''"
if _find_unsafe(s) is None:
return s
# use single quotes, and put single quotes into double quotes
# the string $'b is then quoted as '$'"'"'b'
return "'" + s.replace("'", "'\"'\"'") + "'"
That is to say, the general algorithm is as follows:
An empty string becomes '' (a pair of literal single-quotes).
A string which is known to be safe (though it's safest to not try to implement a codepath for this at all, particularly as shells often implement their own syntax extensions in undefined space) can be emitted bare/unquoted.
Otherwise, prepend a ', emit your input string with all 's replaced with the literal string '"'"', and then append a final '.
That's it. You don't need to escape backslashes (they're literal inside single quotes), newlines (likewise), or anything else.
I have a struct:
type nameSorter struct {
names []Name
by func(s1, s2 *Name) bool
Which is used in this method. What is going on with that comma? If I remove it there is a syntax error.
func (by By) Sort(names []Name) {
sorter := &nameSorter{
names: names,
by: by, //why does there have to be a comma here?
}
sort.Sort(sorter)
Also, the code below works perfectly fine and seems to be more clear.
func (by By) Sort(names []Name) {
sorter := &nameSorter{names, by}
sort.Sort(sorter)
For more context this code is part of a series of declarations for sorting of a custom type that looks like this:
By(lastNameSort).Sort(Names)
This is how go works, and go is strict with things like comma and parentheses.
The good thing about this notion is that when adding or deleting a line, it does not affect other line. Suppose the last comma can be omitted, if you want to add a field after it, you have to add the comma back.
See this post: https://dave.cheney.net/2014/10/04/that-trailing-comma.
From https://golang.org/doc/effective_go.html#semicolons:
the lexer uses a simple rule to insert semicolons automatically as it scans, so the input text is mostly free of them
In other words, the programmer is unburdened from using semicolons, but Go still uses them under the hood, prior to compilation.
Semicolons are inserted after the:
last token before a newline is an identifier (which includes words like int and float64), a basic literal such as a number or string constant, or one of the tokens break continue fallthrough return ++ -- ) }
Thus, without a comma, the lexer would insert a semicolon and cause a syntax error:
&nameSorter{
names: names,
by: by; // semicolon inserted after identifier, syntax error due to unclosed braces
}
I would like to find a regex that will pick out all commas that fall outside quote sets.
For example:
'foo' => 'bar',
'foofoo' => 'bar,bar'
This would pick out the single comma on line 1, after 'bar',
I don't really care about single vs double quotes.
Has anyone got any thoughts? I feel like this should be possible with readaheads, but my regex fu is too weak.
This will match any string up to and including the first non-quoted ",". Is that what you are wanting?
/^([^"]|"[^"]*")*?(,)/
If you want all of them (and as a counter-example to the guy who said it wasn't possible) you could write:
/(,)(?=(?:[^"]|"[^"]*")*$)/
which will match all of them. Thus
'test, a "comma,", bob, ",sam,",here'.gsub(/(,)(?=(?:[^"]|"[^"]*")*$)/,';')
replaces all the commas not inside quotes with semicolons, and produces:
'test; a "comma,"; bob; ",sam,";here'
If you need it to work across line breaks just add the m (multiline) flag.
The below regexes would match all the comma's which are present outside the double quotes,
,(?=(?:[^"]*"[^"]*")*[^"]*$)
DEMO
OR(PCRE only)
"[^"]*"(*SKIP)(*F)|,
"[^"]*" matches all the double quoted block. That is, in this buz,"bar,foo" input, this regex would match "bar,foo" only. Now the following (*SKIP)(*F) makes the match to fail. Then it moves on to the pattern which was next to | symbol and tries to match characters from the remaining string. That is, in our output , next to pattern | will match only the comma which was just after to buz . Note that this won't match the comma which was present inside double quotes, because we already make the double quoted part to skip.
DEMO
The below regex would match all the comma's which are present inside the double quotes,
,(?!(?:[^"]*"[^"]*")*[^"]*$)
DEMO
While it's possible to hack it with a regex (and I enjoy abusing regexes as much as the next guy), you'll get in trouble sooner or later trying to handle substrings without a more advanced parser. Possible ways to get in trouble include mixed quotes, and escaped quotes.
This function will split a string on commas, but not those commas that are within a single- or double-quoted string. It can be easily extended with additional characters to use as quotes (though character pairs like « » would need a few more lines of code) and will even tell you if you forgot to close a quote in your data:
function splitNotStrings(str){
var parse=[], inString=false, escape=0, end=0
for(var i=0, c; c=str[i]; i++){ // looping over the characters in str
if(c==='\\'){ escape^=1; continue} // 1 when odd number of consecutive \
if(c===','){
if(!inString){
parse.push(str.slice(end, i))
end=i+1
}
}
else if(splitNotStrings.quotes.indexOf(c)>-1 && !escape){
if(c===inString) inString=false
else if(!inString) inString=c
}
escape=0
}
// now we finished parsing, strings should be closed
if(inString) throw SyntaxError('expected matching '+inString)
if(end<i) parse.push(str.slice(end, i))
return parse
}
splitNotStrings.quotes="'\"" // add other (symmetrical) quotes here
Try this regular expression:
(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*=>\s*(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*,
This does also allow strings like “'foo\'bar' => 'bar\\',”.
MarkusQ's answer worked great for me for about a year, until it didn't. I just got a stack overflow error on a line with about 120 commas and 3682 characters total. In Java, like this:
String[] cells = line.split("[\t,](?=(?:[^\"]|\"[^\"]*\")*$)", -1);
Here's my extremely inelegant replacement that doesn't stack overflow:
private String[] extractCellsFromLine(String line) {
List<String> cellList = new ArrayList<String>();
while (true) {
String[] firstCellAndRest;
if (line.startsWith("\"")) {
firstCellAndRest = line.split("([\t,])(?=(?:[^\"]|\"[^\"]*\")*$)", 2);
}
else {
firstCellAndRest = line.split("[\t,]", 2);
}
cellList.add(firstCellAndRest[0]);
if (firstCellAndRest.length == 1) {
break;
}
line = firstCellAndRest[1];
}
return cellList.toArray(new String[cellList.size()]);
}
#SocialCensus, The example you gave in the comment to MarkusQ, where you throw in ' alongside the ", doesn't work with the example MarkusQ gave right above that if we change sam to sam's: (test, a "comma,", bob, ",sam's,",here) has no match against (,)(?=(?:[^"']|["|'][^"']")$). In fact, the problem itself, "I don't really care about single vs double quotes", is ambiguous. You have to be clear what you mean by quoting either with " or with '. For example, is nesting allowed or not? If so, to how many levels? If only 1 nested level, what happens to a comma outside the inner nested quotation but inside the outer nesting quotation? You should also consider that single quotes happen by themselves as apostrophes (ie, like the counter-example I gave earlier with sam's). Finally, the regex you made doesn't really treat single quotes on par with double quotes since it assumes the last type of quotation mark is necessarily a double quote -- and replacing that last double quote with ['|"] also has a problem if the text doesn't come with correct quoting (or if apostrophes are used), though, I suppose we probably could assume all quotes are correctly delineated.
MarkusQ's regexp answers the question: find all commas that have an even number of double quotes after it (ie, are outside double quotes) and disregard all commas that have an odd number of double quotes after it (ie, are inside double quotes). This is generally the same solution as what you probably want, but let's look at a few anomalies. First, if someone leaves off a quotation mark at the end, then this regexp finds all the wrong commas rather than finding the desired ones or failing to match any. Of course, if a double quote is missing, all bets are off since it might not be clear if the missing one belongs at the end or instead belongs at the beginning; however, there is a case that is legitimate and where the regex could conceivably fail (this is the second "anomaly"). If you adjust the regexp to go across text lines, then you should be aware that quoting multiple consecutive paragraphs requires that you place a single double quote at the beginning of each paragraph and leave out the quote at the end of each paragraph except for at the end of the very last paragraph. This means that over the space of those paragraphs, the regex will fail in some places and succeed in others.
Examples and brief discussions of paragraph quoting and of nested quoting can be found here http://en.wikipedia.org/wiki/Quotation_mark .
I would like to find a regex that will pick out all commas that fall outside quote sets.
For example:
'foo' => 'bar',
'foofoo' => 'bar,bar'
This would pick out the single comma on line 1, after 'bar',
I don't really care about single vs double quotes.
Has anyone got any thoughts? I feel like this should be possible with readaheads, but my regex fu is too weak.
This will match any string up to and including the first non-quoted ",". Is that what you are wanting?
/^([^"]|"[^"]*")*?(,)/
If you want all of them (and as a counter-example to the guy who said it wasn't possible) you could write:
/(,)(?=(?:[^"]|"[^"]*")*$)/
which will match all of them. Thus
'test, a "comma,", bob, ",sam,",here'.gsub(/(,)(?=(?:[^"]|"[^"]*")*$)/,';')
replaces all the commas not inside quotes with semicolons, and produces:
'test; a "comma,"; bob; ",sam,";here'
If you need it to work across line breaks just add the m (multiline) flag.
The below regexes would match all the comma's which are present outside the double quotes,
,(?=(?:[^"]*"[^"]*")*[^"]*$)
DEMO
OR(PCRE only)
"[^"]*"(*SKIP)(*F)|,
"[^"]*" matches all the double quoted block. That is, in this buz,"bar,foo" input, this regex would match "bar,foo" only. Now the following (*SKIP)(*F) makes the match to fail. Then it moves on to the pattern which was next to | symbol and tries to match characters from the remaining string. That is, in our output , next to pattern | will match only the comma which was just after to buz . Note that this won't match the comma which was present inside double quotes, because we already make the double quoted part to skip.
DEMO
The below regex would match all the comma's which are present inside the double quotes,
,(?!(?:[^"]*"[^"]*")*[^"]*$)
DEMO
While it's possible to hack it with a regex (and I enjoy abusing regexes as much as the next guy), you'll get in trouble sooner or later trying to handle substrings without a more advanced parser. Possible ways to get in trouble include mixed quotes, and escaped quotes.
This function will split a string on commas, but not those commas that are within a single- or double-quoted string. It can be easily extended with additional characters to use as quotes (though character pairs like « » would need a few more lines of code) and will even tell you if you forgot to close a quote in your data:
function splitNotStrings(str){
var parse=[], inString=false, escape=0, end=0
for(var i=0, c; c=str[i]; i++){ // looping over the characters in str
if(c==='\\'){ escape^=1; continue} // 1 when odd number of consecutive \
if(c===','){
if(!inString){
parse.push(str.slice(end, i))
end=i+1
}
}
else if(splitNotStrings.quotes.indexOf(c)>-1 && !escape){
if(c===inString) inString=false
else if(!inString) inString=c
}
escape=0
}
// now we finished parsing, strings should be closed
if(inString) throw SyntaxError('expected matching '+inString)
if(end<i) parse.push(str.slice(end, i))
return parse
}
splitNotStrings.quotes="'\"" // add other (symmetrical) quotes here
Try this regular expression:
(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*=>\s*(?:"(?:[^\\"]+|\\(?:\\\\)*[\\"])*"|'(?:[^\\']+|\\(?:\\\\)*[\\'])*')\s*,
This does also allow strings like “'foo\'bar' => 'bar\\',”.
MarkusQ's answer worked great for me for about a year, until it didn't. I just got a stack overflow error on a line with about 120 commas and 3682 characters total. In Java, like this:
String[] cells = line.split("[\t,](?=(?:[^\"]|\"[^\"]*\")*$)", -1);
Here's my extremely inelegant replacement that doesn't stack overflow:
private String[] extractCellsFromLine(String line) {
List<String> cellList = new ArrayList<String>();
while (true) {
String[] firstCellAndRest;
if (line.startsWith("\"")) {
firstCellAndRest = line.split("([\t,])(?=(?:[^\"]|\"[^\"]*\")*$)", 2);
}
else {
firstCellAndRest = line.split("[\t,]", 2);
}
cellList.add(firstCellAndRest[0]);
if (firstCellAndRest.length == 1) {
break;
}
line = firstCellAndRest[1];
}
return cellList.toArray(new String[cellList.size()]);
}
#SocialCensus, The example you gave in the comment to MarkusQ, where you throw in ' alongside the ", doesn't work with the example MarkusQ gave right above that if we change sam to sam's: (test, a "comma,", bob, ",sam's,",here) has no match against (,)(?=(?:[^"']|["|'][^"']")$). In fact, the problem itself, "I don't really care about single vs double quotes", is ambiguous. You have to be clear what you mean by quoting either with " or with '. For example, is nesting allowed or not? If so, to how many levels? If only 1 nested level, what happens to a comma outside the inner nested quotation but inside the outer nesting quotation? You should also consider that single quotes happen by themselves as apostrophes (ie, like the counter-example I gave earlier with sam's). Finally, the regex you made doesn't really treat single quotes on par with double quotes since it assumes the last type of quotation mark is necessarily a double quote -- and replacing that last double quote with ['|"] also has a problem if the text doesn't come with correct quoting (or if apostrophes are used), though, I suppose we probably could assume all quotes are correctly delineated.
MarkusQ's regexp answers the question: find all commas that have an even number of double quotes after it (ie, are outside double quotes) and disregard all commas that have an odd number of double quotes after it (ie, are inside double quotes). This is generally the same solution as what you probably want, but let's look at a few anomalies. First, if someone leaves off a quotation mark at the end, then this regexp finds all the wrong commas rather than finding the desired ones or failing to match any. Of course, if a double quote is missing, all bets are off since it might not be clear if the missing one belongs at the end or instead belongs at the beginning; however, there is a case that is legitimate and where the regex could conceivably fail (this is the second "anomaly"). If you adjust the regexp to go across text lines, then you should be aware that quoting multiple consecutive paragraphs requires that you place a single double quote at the beginning of each paragraph and leave out the quote at the end of each paragraph except for at the end of the very last paragraph. This means that over the space of those paragraphs, the regex will fail in some places and succeed in others.
Examples and brief discussions of paragraph quoting and of nested quoting can be found here http://en.wikipedia.org/wiki/Quotation_mark .
I need to actually print a Dollar sign in Dart, ahead of a variable. For example:
void main()
{
int dollars=42;
print("I have $dollars."); // I have 42.
}
I want the output to be: I have $42. How can I do this? Thanks.
Dart strings can be either raw or ... not raw (normal? cooked? interpreted? There isn't a formal name). I'll go with "interpreted" here, because it describes the problem you have.
In a raw string, "$" and "\" mean nothing special, they are just characters like any other.
In an interpreted string, "$" starts an interpolation and "\" starts an escape.
Since you want the interpolation for "$dollars", you can't use "$" literally, so you need to escape it:
int dollars = 42;
print("I have \$$dollars.");
If you don't want to use an escape, you can combine the string from raw and interpreted parts:
int dollars = 42;
print(r"I have $" "$dollars.");
Two adjacent string literals are combined into one string, even if they are different types of string.
You can use a backslash to escape:
int dollars=42;
print("I have \$$dollars."); // I have $42.
When you are using literals instead of variables you can also use raw strings:
print(r"I have $42."); // I have $42.