Dangling topic (or any other) variable does not fail - syntax

This works:
$_ =
say "hi";
That is, you can put any amount of whitespace between an assignment and whatever follows it, and the whitespace is simply ignored. You can use any variable (with my) too. Effectively, $_ will be assigned the result of the say, which is True.
Is this surprising, but up to spec, or simply surprising?

There may be any amount of whitespace either side of an operator. Thus:
say 1
+ 2
+ 3;
Is, so far as the compiler sees it, entirely the same as:
say 1 + 2 + 3;
Assignment (=) is just another operator, so also follows these rules.
Further, say is just a normal built-in subroutine, so it's just like:
my $answer = flip '24';
say $answer; # 42
Except with more whitespace:
my $answer =
flip '24';
say $answer; # 42
There are some places in Perl 6 where whitespace is significant, but the whitespace between infix operators is not one of them.

TL;DR P6 syntax is freeform.
Dangling topic (or any other) variable does not fail
The issue you describe is a very general one. It's most definitely not merely about variable declaration/assignment!
In P6, parsing of a statement -- a single imperative unit ("do this") -- generally just keeps on going until it reaches an explicit statement separator -- ; -- that brings a statement to its end, just like the period (aka full stop) at the end of this English sentence.
you can put any amount of whitespace
Like many programming languages, standard P6 is generally freeform. That is to say, wherever some whitespace is valid, then generally any amount of whitespace -- horizontal and vertical whitespace -- is syntactically equivalent.
$_ =
say "hi";
The above works exactly as would be expected if someone is applying the freeform principle -- it's a single statement assigning the value of the say to the $_ variable.
Is this surprising, but up to spec, or simply surprising?
I like inventing (hopefully pithy) sayings. I just invented "Surprise follows surmise".
It's up to spec. I know to expect it. It doesn't surprise me.
If someone embraces the fact that P6 is generally freeform and has semicolon statement separation then it will, I predict, (eventually -- likely quickly) stop being surprising.
The foregoing is a direct answer to your question. See also Jonathan's answer for more along the same lines. Feel free to ignore the rest of this answer.
For the rest of this answer I use "freeform" to refer to P6's combination of freeform syntax, semicolon statement separation, and braced blocks ({...}).
The rest of this answer is in three sections:
P6 exceptions to freeform syntax
Freeform vs line oriented
Freeform and line oriented?
P6 exceptions to freeform syntax
@Larry concluded that intuition, aesthetics, convenience, and/or other factors justified an exception from a pure freeform syntax in standard P6 in a few cases.
Statements may omit the trailing semicolon if they:
Are the last statement in a source file, or in a block;
End in a block whose closing curly is followed by a newline (ignoring comments).
Thus none of the three statements below (the if and two says) need a closing semicolon:
if 42 {
say 'no semicolon needed before the closing curly'
} # no semicolon needed after the closing curly
say 'no semicolon needed if last statement in source file'
Sometimes this may not be what's wanted:
{ ... } # the closing } ends the statement (block)
.() # call is invoked on $_
One way to change that is to use parentheses:
({ ... })
.() # calls the block on prior line
For some constructs spaces are either required or disallowed. For example, some postfixes must directly follow the values they apply to and some infixes must not. These are both syntax errors:
foo ++
foo«+»bar
Freeform vs line oriented
For some coding scenarios P6's freeform syntax is arguably a strong net positive, eg:
One liners can use blocks;
FP code is natural (one can write non-trivial closures);
More straightforward editing/refactoring/copying/pasting.
But there are downsides:
The writing and reading overhead of freeform syntax -- semicolons and block braces.
Ignoring the intuition that presumably led you to post your question.
The latter is potent. All of the following could lead someone to think that the say in your example is part of a new statement that isn't a continuation of the $_ = statement:
The newline after the =;
The blank line after that;
The lack of an indent at the start of the say line relative to the $_ = line;
The nature of say (it might seem like say must be the start of a new statement).
An upshot of the above intuitions is that some programming languages adopt a "line-oriented" syntax rather than a freeform one, the most famous being Python (see footnote 1).
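For contrast, here is a minimal Python sketch of the same shape of code as the question's example. It assumes CPython 3 and uses the built-in compile() only to ask the parser for its verdict without running anything; the helper name parses and the snippet strings are illustrative, not taken from the question.

```python
def parses(source: str) -> bool:
    """Return True if Python's parser accepts the given source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Direct analogue of the P6 example: assignment, newline, then the call.
dangling = "x =\nprint('hi')\n"

# The same assignment kept on a single line.
one_line = "x = print('hi')\n"

print(parses(dangling))  # False: the newline ends the statement after "x ="
print(parses(one_line))  # True
```

In other words, a line-oriented parser rejects the dangling assignment outright, which matches the intuition listed above.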
Freeform and line oriented syntax
Some languages, eg Haskell, allow use of either line oriented or freeform syntax (at least for some language constructs).
P6 supports slangs, userland modules that mutate the language. Imagine a slang that supported both freeform and line-oriented code so:
Those learning P6 encountered more familiarity and fewer surprises as they learned the language's basics by focusing on either line-oriented or freeform code based on their preference;
Those familiar with P6 could write better code by using either line-oriented or freeform syntax.
At the risk of over-complicating things, imagine a slang that adopts not only line orientation but also the off-side rule that Python supports, and implements no strict; for untyped sigil-free variables (which drops declarators and sigils and promotes immutability). Here's a fragment of some code written in said imagined slang that I posted in a reddit comment a few weeks ago:
sub egg(bar)
  put bar
data = ["Iron man", "is", "Tony Stark"]
callbacks = []
Perhaps something like the above is too difficult to pull off? (I don't currently see why.)
Footnotes
1 The remainder of this section compares P6 and Python using the Wikipedia section on Programming language statements as our guide:
A statement separator is used to demarcate boundaries between two separate statements.
In P6 it's ; or the end of blocks.
In Python ; is available to separate statements. But it's primarily line-oriented.
Languages that interpret the end of line to be the end of a statement are called "line-oriented" languages.
In Python a line end terminates a statement unless the next line is suitably indented (in which case it's the start of an associated sub-block) or an explicit line continuation character appears at the end of a line.
P6 is not line-oriented. (At least, not standard P6. I'll posit a P6 slang at the end of this answer that supports both freeform and line-oriented syntax.)
"Line continuation" is a convention in line-oriented languages [that] allows a single statement to span more than just one line.
Python has line continuation features; see the Wikipedia article for details.
Standard P6 also has line continuation features despite not being line-oriented (see footnote 2).
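As a rough illustration of the Python side of this comparison, the sketch below shows the two continuation forms I'm sure of: an explicit backslash at the end of a line, and implicit continuation inside brackets. The variable names are made up for the example.

```python
# Explicit continuation: a backslash as the very last character of the line.
total = 1 \
    + 2 \
    + 3

# Implicit continuation: a newline inside (), [] or {} does not end the statement.
same_total = (
    1
    + 2
    + 3
)

print(total, same_total)  # 6 6
```

Without the backslashes or the parentheses, the first line would be a complete statement and the following `+ 2` lines would be rejected as unexpectedly indented.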
2 P6 supports line continuation. Continuing with quotes from Wikipedia:
a newline normally results in a token being added to the token stream, unless line continuation is detected.
(A token is the smallest fragment of code -- beyond an individual character -- that's treated as an atomic unit by the parser.)
Standard P6 always assumes a token break if it encounters a newline with the sole exception of a string written across lines like this:
say
'The
quick
fox';
This will compile OK and display The, quick, and fox on three separate lines.
The equivalent in Python will generate a syntax error.
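A small sketch to back up the Python claim, assuming CPython 3. It hands the parser a single-quoted literal broken across physical lines and, for comparison, a triple-quoted one; the helper parses is an illustrative name and nothing is executed, only compiled.

```python
def parses(source: str) -> bool:
    """Return True if Python's parser accepts the given source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The "\n" escapes below become real newlines in the source being compiled,
# so the inner literal really is split across three physical lines.
single_quoted = "print('The\nquick\nfox')\n"
triple_quoted = "print('''The\nquick\nfox''')\n"

print(parses(single_quoted))  # False: string literal not closed before the line end
print(parses(triple_quoted))  # True: triple-quoted strings may span lines
```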
Backslash as last character of line
In P6 a backslash:
Cannot appear in the middle of a token;
Can be used to introduce whitespace in sourcecode that is ignored by the parser to avoid what would otherwise be a syntax error. For example:
say @foo\
»++
This is an instance of the more general concept of "unspace", which can be used anywhere within a line, not just at the end:
say @foo\ »++
Some form of inline comment serves as line continuation
This works:
say {;}#`( An embedded
comment ).signature
An embedded comment:
Cannot appear in the middle of a token;
Isn't as general as a backslash (say @foo#`[...]»++ doesn't work).

Related

Should I split a ruby single-line if statement into multi-line if statement because the line is long?

I have a Ruby single-line statement that is very long, about 200 characters. According to a Ruby style guide, a single-line if statement is favored here because the body is a single line.
address = Module::InnerModule::Class.new(long_address) if Module::Class.new(long_address).is_good?
But, 200 characters is way over the usual threshold for line length (which is usually 120 at most). Should I split the if statement into a multi-line statement in order to reduce the line length (or should I just accept that the line is long)?
if Module::Class.new(long_address).is_good?
  address = Module::InnerModule::Class.new(long_address)
end
Also, what happens if the line is still very long after splitting? What is the best practice here? I'm new to Ruby, so I would appreciate any advice on the best practice here.
Style questions aside, if you want to maintain your current semantics, you can break lines at certain keywords and operators without escaping newlines with backslashes. For example:
address =
Module::InnerModule::Class.new(long_address) if
Module::Class.new(long_address).is_good?
Otherwise, change your semantics or refactor your code to fit your desired line length and chosen style. Questions about how to split lines are answerable, but the “best” way to split, indent, or refactor is largely subjective, and mostly comes down to a combination of readability and intent.

What does the "%" mean in tcl?

In a situation like this for example:
[% $create_port %]
or [list [% $RTL_LIST %]]
I realized it had to do with the brackets, but what confuses me is that sometimes it is used with the brackets and variable followed, and sometimes you have brackets with variables inside without the %.
So I'm not sure what it is used for.
Any help is appreciated.
% is not a metacharacter in the Tcl language core, but it still has a few meanings in Tcl. In particular, it's the modulus operator in expr and a substitution field specifier in format, scan, clock format and clock scan. (It's also the default prompt character, and I have a trivial pass-through % command in my ~/.tclshrc to make cut-n-pasting code easier, but nobody else in the world needs to follow my lead there!)
But the code you have written does not appear to be any of those (because it would be a syntax error in all of the commands I've mentioned). It looks like it is some sort of directive processing scheme (with the special sequences being [% and %], with the brackets) though not one I recognise such as doctools or rivet. Because a program that embeds a Tcl interpreter could do an arbitrary transformation to scripts before executing them, it's extremely difficult to guess what it might really be.

Bash variable concatenation

Which form is most efficient?
1)
v=''
v+='a'
v+='b'
v+='c'
2)
v2='a'` `'b'` `'c'
Assuming readability were exactly the same to you, and that's a stretch, would 1) mean creating and throwing away a few string immutables (like in Python) or act as a Java "StringBuffer" with periodical expansion of the buffer capacity? How are string concatenations handled internally in Bash?
If 2) were just as readable to you as 1), would the backticks spawn subshells and would that be more costly, even as a potential 'no-op' than what is done in 1) ?
Well, the simplest and most efficient mechanism would be option 0:
v="abc"
The first mechanism involves four assignments.
The second mechanism is bizarre (and is definitely not readable). It (nominally) runs an empty command in two sub-shells (the two ` ` parts) and concatenates the outputs (an empty string) with the three constants. If the shell simply executes the back-tick commands without noting that they're empty (and it's not unreasonable that it won't notice; it is a weird thing to try — I don't recall seeing it done in my previous 30 years of shell scripting), this is definitely vastly slower.
So, given only options (1) and (2), use option (1), but in general, use option (0) shown above.
Why would you be building up the string piecemeal like that? What's missing from your example that makes the original code sensible but the reduced code shown less sensible?
v=""
x=$(...)
v="$v$x"
y=$(...)
v="$v$y"
z=$(...)
v="$v$z"
This would make more sense, especially if you use each of $x, $y and $z later, and/or use intermediate values of $v (perhaps in the commands represented by triple dots). The concatenation notation used will work with any Bourne-shell derivative; the alternative += notation works in fewer shells, but is probably slightly more efficient (with the emphasis on 'slightly').
The portable and straightforward method would be to use double quotes and curly brackets for variables:
VARA="beginning text ${VARB} middle text ${VARC}..."
You can even set default values for empty variables this way, and take substrings (note that the ${VARC:0:3} substring form is a Bash extension rather than POSIX sh):
VARA="${VARB:-default text} substring manipulation 1st 3 characters ${VARC:0:3}"
Using the curly brackets prevents situations where there is a $VARa and you want to write ${VAR}a but end up getting the contents of ${VARa}.

History of trailing comma in programming language grammars

Many programming languages allow trailing commas in their grammar following the last item in a list. Supposedly this was done to simplify automatic code generation, which is understandable.
As an example, the following is a perfectly legal array initialization in Java (JLS 10.6 Array Initializers):
int[] a = { 1, 2, 3, };
I'm curious if anyone knows which language was first to allow trailing commas such as these. Apparently C had it as far back as 1985.
Also, if anybody knows other grammar "peculiarities" of modern programming languages, I'd be very interested in hearing about those also. I read that Perl and Python for example are even more liberal in allowing trailing commas in other parts of their grammar.
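To make the Python part of that concrete, here is a short sketch of places where Python 3 accepts a trailing comma. The names and values are invented for illustration, and it makes no claim about which language allowed this first.

```python
# Trailing commas in list, tuple and dict displays.
numbers = [1, 2, 3,]
point = (1, 2,)
table = {"a": 1, "b": 2,}


# Trailing commas in a parameter list and in a call's argument list.
def add(a, b,):
    return a + b


result = add(1, 2,)

# Here the trailing comma is significant: it is what makes this a tuple.
single = (1,)

print(numbers, point, table, result, single)
```

The one-element tuple is the case worth remembering: everywhere else the trailing comma is optional, but `(1)` without it is just the integer 1.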
I'm not an expert on the commas, but I know that standard Pascal was very persnickety about semicolons being statement separators, not terminators. That meant you had to be very, very careful about where you put one if you didn't want to get yelled at by the compiler.
Later Pascal-esque languages (C, Modula-2, Ada, etc.) had their standards written to accept the odd extra semicolon without behaving like you'd just peed in the cake mix.
I just found out that a g77 Fortran compiler has the -fugly-comma Ugly Null Arguments flag, though it's a bit different (and as the name implies, rather ugly).
The -fugly-comma option enables use of a single trailing comma to mean “pass an extra trailing null argument” in a list of actual arguments to an external procedure, and use of an empty list of arguments to such a procedure to mean “pass a single null argument”.
For example, CALL FOO(,) means “pass two null arguments”, rather than “pass one null argument”. Also, CALL BAR() means “pass one null argument”.
I'm not sure which version of the language this first appeared in, though.
[Does anybody know] other grammar "peculiarities" of modern programming languages?
One of my favorites, Modula-3, was designed in 1990 with Niklaus Wirth's blessing as the then-latest language in the "Pascal family". Does anyone else remember those awful fights about whether semicolon should be a separator or a terminator? In Modula-3, the choice is yours! The EBNF for a sequence of statements is
stmt ::= BEGIN [stmt {; stmt} [;]] END
Similarly, when writing alternatives in a CASE statement, Modula-3 let you use the vertical bar | as either a separator or a prefix. So you could write
CASE c OF
| 'a', 'e', 'i', 'o', 'u' => RETURN Char.Vowel
| 'y' => RETURN Char.Semivowel
ELSE RETURN Char.Consonant
END
or you could leave off the initial bar, perhaps because you prefer to write OF in that position.
I think what I liked as much as the design itself was the designers' awareness that there was a religious war going on and their persistence in finding a way to support both sides.
Let the programmer choose!
P.S. Objective Caml allows permissive use of | in case expressions whereas the earlier and closely related dialect Standard ML does not. As a result, case expressions are often uglier in Standard ML code.
EDIT: After seeing T.E.D.'s answer I checked the Modula-2 grammar and he's correct, Modula-2 also supported semicolon as terminator, but through the device of the empty statement, which makes stuff like
x := x + 1;;;;;; RETURN x
legal. I suppose that's not a bad thing. Modula-2 didn't allow flexible use of the case separator |, however; that seems to have originated with Modula-3.
Something which has always galled me about C is that although it allows an extra trailing comma in an initializer list, it does not allow an extra trailing comma in an enumerator list (for defining the literals of an enumeration type), at least not before C99. This little inconsistency has bitten me in the ass more times than I care to admit. And for no reason!

Fortran: line too long / append line - but with text at the end?

I have a line of Fortran code, which includes some text. I'm changing the text, which makes the code line too long for Fortran, so I split it over two lines using 'a'.
Was:
      IF (MYVAR .EQ. 1) THEN
         WRITE(iott,'(A) (A)') 'ABC=', SOMEVAR
Changed to:
      IF (MYVAR .EQ. 1) THEN
         WRITE(iott,'(A) (A)') 'ABC DEF GHI JK
     a ' // 'L=', SOMEVAR
My question is, on the new line (starting with 'a'), does the white space between the 'a' and the first ' get appended to the string? Or do I need the ' to be the char next to a to prevent additional white space?
As you can tell, I'm not used to Fortran...
If you're worried about exceeding a 72 column limit, then I assume you're using Fortran 77. The syntax for Fortran 77 requires that you start with column 7, except for continued lines, which need a continuation character in column 6. I use the following method to tell me how many lines are continued for one statement (the first line is just to show the columns):
!234567890
      write(*,*)"Lorem Ipsum",
     1 " Foo",
     2 " Bar"
This would print:
Lorem Ipsum Foo Bar
You don't have to worry about spaces that aren't in quotes: in fixed-form Fortran, blanks outside character literals are ignored anyway.
It's worthwhile learning how to use format statements. They can make output a lot easier. It's somewhat similar to printf statements, if you're coming from C. You specify a format with different types of parameters, then give variables or literals to fill out that format.
And don't worry that you're not working with the hot, new, language of the day. You can learn a lot from Fortran, even Fortran 77, and when used properly, Fortran can even be elegant. I've seen Fortran 77 written as if it were an object oriented language, complete with dynamic memory. I like to say, "old.ne.bad".
It's been too long for me to remember the old column requirements of FORTRAN (and they may not even be as strict as they were way back when).
But - isn't this something that a quick test run will tell you straight off?
Yes, the a is a continuation character and basically it just means append the rest of this line starting after the continuation character (col 6, right?) to the previous line.
Your Fortran compiler probably has an option to turn on "free form" input instead of using "fixed form" input. Use this and you won't have to worry about line length.
If your Fortran compiler is older than F90 (which is when I think free-form input was introduced), you have my condolences.
@Mike B:
In an ideal world yes, but in this case the code is developed on one machine, and submitted to a build server which has the appropriate 3rd party software / SDK's / licenses available to it to build. The build isn't exactly quick either.

Resources