Semantic meaning of '36_864_7_345ms' as a time literal - syntax

Reading the spec for verilog, it appears that
36_864_7_345ms
Is a valid time literal: http://www.ece.uah.edu/~gaede/cpe526/SystemVerilog_3.1a.pdf (see section 2)
Note: decimal_digit is defined as [0-9] in the full IEEE spec.
What is the semantic meaning (if any) of this time literal? Or am I misreading the spec?
Edit:
Looking elsewhere in the spec (section 3.7.9), it appears that the underscore characters are silently discarded. Does the underscore act as an arbitrary separating character, in the same way that numbers in English (e.g. 43,251) use commas to visually separate groups of digits? Or is there another meaning altogether?

The spec you quoted from is long since obsolete. Please get the latest from the IEEE where it says in section 5.7.1 Integer literal constants:
The underscore character (_) shall be legal anywhere in a number
except as the first character. The underscore character is ignored.
This feature can be used to break up long numbers for readability
purposes.
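As a rough illustration of "the underscore character is ignored", here is a minimal Python sketch that reads an integer time literal. The function name parse_time_literal and the regular expression are assumptions made for this example (they are not taken from the standard's grammar verbatim), and only integer literals with the units fs, ps, ns, us, ms and s are handled.

```python
import re

# Assumed shape: a number that starts with a digit and may contain
# underscores anywhere after that, followed directly by a time unit.
_TIME_LITERAL = re.compile(r"([0-9][0-9_]*)(fs|ps|ns|us|ms|s)")


def parse_time_literal(text):
    """Return (value, unit) for an integer time literal, or None if malformed."""
    match = _TIME_LITERAL.fullmatch(text)
    if match is None:
        return None
    digits, unit = match.groups()
    # Underscores carry no meaning, so drop them before converting.
    return int(digits.replace("_", "")), unit


print(parse_time_literal("36_864_7_345ms"))
```

Under these assumptions the unusual grouping in 36_864_7_345ms makes no difference: the literal denotes the same value as 368647345ms.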

Related

What are valid date-time separators in RFC3339 strings?

I'm quite confused as to what's allowed as the time separator/designator in the RFC3339 standard. By time separator I mean the sequence of characters that draw the line between date and time.
The standard states in section 5.6 different things that are unclear or conflicting. First of all, it says that the production rule for a full datetime is this:
date-time = full-date "T" full-time
Meaning that the delimiter between the date and the time is an uppercase T. Right after comes this:
NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
syntax may alternatively be lower case "t" or "z" respectively
Meaning the upper case T may be a lower case t. It conflicts with the ABNF, but OK, it still sounds to me within the realm of reasonable. Then the following is stated:
NOTE: ISO 8601 defines date and time separated by "T".
Applications using this syntax may choose, for the sake of
readability, to specify a full-date and full-time separated by
(say) a space character.
Which is very confusing. Does this allow not only a space character but anything, as the "(say)" implies? Or does "this syntax" refer to ISO8601, so that the note unnecessarily describes a detail of that other standard?
In other words, are the following valid RFC3339 strings?
2020-09-07 20:26:03.623359300+02:00
2020-09-07hey johnny20:26:03.623359300+02:00
2020-09-07💩20:26:03.623359300+02:00
Meaning the upper case T may be a lower case t. It conflicts with the ABNF, [...]
It does not. See 2.3 Terminal Values of RFC 2234:
Literal text strings are interpreted as a concatenated set of
printable characters.
NOTE: ABNF strings are case-insensitive and
the character set for these strings is us-ascii.
So it is allowed to use t here.
NOTE: ISO 8601 defines date and time separated by "T".
Applications using this syntax may choose, for the sake of
readability, to specify a full-date and full-time separated by
(say) a space character.
Which is very confusing. Does this allow not only a space character
but anything?
This "deviation" is used for readability to the user when displayed. So when the value is displayed to the user in some kind, it can be displayed as:
2020-09-07 20:26:03.623359300+02:00
2020-09-07, 20:26:03.623359300+02:00
That way it might be easier for the user to see the clear space between the date and time, so they don't have to look for the T or t character to find the separation. It is indeed a vague sentence, as it basically means the application can do anything.
To answer your question: These listed date formats are not valid according to RFC 3339.
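A minimal Python sketch of that strict reading, assuming the date-time production from section 5.6 with only T or t accepted as the separator. The name looks_like_rfc3339 is made up for this example, and the pattern checks the shape only, not field ranges such as month 13 or hour 25.

```python
import re

# Shape-only check for: full-date "T" full-time (strict reading of RFC 3339).
_RFC3339_STRICT = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"      # full-date
    r"[Tt]"                            # separator: T, or t since ABNF strings are case-insensitive
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}"      # partial-time without the fraction
    r"(\.[0-9]+)?"                     # optional time-secfrac
    r"([Zz]|[+-][0-9]{2}:[0-9]{2})"    # time-offset
)


def looks_like_rfc3339(text):
    """Return True if text has the strict RFC 3339 date-time shape."""
    return _RFC3339_STRICT.fullmatch(text) is not None


for candidate in (
    "2020-09-07T20:26:03.623359300+02:00",
    "2020-09-07 20:26:03.623359300+02:00",
    "2020-09-07hey johnny20:26:03.623359300+02:00",
):
    print(candidate, looks_like_rfc3339(candidate))
```

A parser that wants to tolerate the space-separated display form could widen the separator class, but that would be a deliberate deviation from this strict reading.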
Short answer: T (or t as discouraged alternative).
After reading on this as much as I could, it turns out the time separator must be a T or t. What made me think this way is, first of all, this thread in the GNU lists, where F. Alexander Njemz contacted the authors of RFC3339, Graham Klyne and Chris Newman, asking if T is mandatory, and got this response from Mr. Klyne:
In short: "yes"
Per section 5.5, the intent in this draft was to specify a timestamp format using
elements from and compatible with 8601, but eliminating as far as
reasonable any variations that could make timestamp data harder to
process. This includes making the 'T' mandatory in date+time values.
#g
Just for clarity's sake, this is stated in the section 5.5:
Simplicity is achieved by making most fields and punctuation
mandatory.
This clearly clashes with a non-mandatory T and strongly makes me think that the "this syntax" in that problematic passage refers to ISO8601 and not RFC3339.
For those who want to read more, here are some links regarding the confusion created by this specific point:
https://lists.gnu.org/archive/html/bug-coreutils/2006-05/msg00014.html
http://validator.w3.org/feed/docs/error/InvalidRFC3339Date.html
https://www.rfc-editor.org/errata/eid5783
Plus of course divergent implementations. For instance, the developers of GNU Date chose to use a space character:
$ date --rfc-3339=seconds
2020-09-14 14:53:51+02:00

What are valid identifiers in R7RS-small?

R7RS-small says that all identifiers must be terminated by a delimiter, but at the same time it defines pretty elaborate rules for what can be in an identifier. So, which one is it?
Is an identifier supposed to start with an initial character and then continue until a delimiter, or does it start with an initial character and continue following the syntax defined in 7.1.1.
Here are a couple of obvious cases. Are these valid identifiers?
a#a
b,b
c'c
d[d]
If they are not supposed to be valid, what is the purpose of saying that an identifier must be terminated by a delimiter?
|..ident..| are delimiters for symbols in R7RS, to allow any character that you cannot insert in an old style symbol (| is the delimiter).
However, in R6RS the "official" grammar was incorrect, as it did not allow defining symbols such as 1+, which led all implementations to define their own rules to work around this flaw in the official grammar.
Unless you need to read the source code of a given implementation and see how it defines the symbols, you should not care too much about these rules and use classical symbols.
In section 7.1.1 you find the Backus-Naur form that defines the lexical structure of R7RS identifiers, but I doubt the implementations follow it.
I quote from here
As with identifiers, different implementations of Scheme use slightly
different rules, but it is always the case that a sequence of
characters that contains no special characters and begins with a
character that cannot begin a number is taken to be a symbol
In other words, an implementation will use a function like read-atom and after that it will classify an atom by backtracking with read-number and if number? fails it will be a symbol.
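The read-atom strategy described above can be sketched in Python as follows. This is an illustration of that strategy under stated assumptions, not the grammar from 7.1.1: the names read_atom and classify are hypothetical, the delimiter set is whitespace plus | ( ) " and ;, and the number test only covers simple decimal numbers.

```python
import re

# Assumed delimiter set: whitespace, vertical line, parentheses,
# double quote and semicolon.
DELIMITERS = set(' \t\n\r|()";')

# Simplified number test: optional sign, then decimal digits with an
# optional fractional part. Real Scheme numbers have many more forms.
_NUMBER = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def read_atom(text, start=0):
    """Consume characters up to the next delimiter; return (atom, end index)."""
    end = start
    while end < len(text) and text[end] not in DELIMITERS:
        end += 1
    return text[start:end], end


def classify(atom):
    """Anything that does not parse as a number is taken to be a symbol."""
    return "number" if _NUMBER.fullmatch(atom) else "symbol"


for source in ("a#a", "b,b", "1+", "42"):
    atom, _ = read_atom(source)
    print(atom, classify(atom))
```

A reader built this way treats a#a and b,b as single symbols, because neither # nor , is in the assumed delimiter set. A reader that follows the 7.1.1 grammar strictly may not agree, which is exactly the divergence between implementations mentioned above.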

Significance of an ampersand in VB6 function name?

I just got a bunch of legacy VB6 (!) code dumped on me and I keep seeing functions declared with an ampersand at the end of the name, for example, Private Declare Function ShellExecute& . . ..
I've been unable to find an answer to the significance of this, nor have I been able to detect any pattern in use or signature of the functions that have been named thusly.
Anyone know if those trailing ampersands mean anything to the compiler, or at least if there's some convention that I'm missing? So far, I'm writing it off as a strange programmer, but I'd like to know for sure if there's any meaning behind it.
It means that the function returns a Long (i.e. 32-bit integer) value.
It is equivalent to
Declare Function ShellExecute(...) As Long
The full list of suffixes is as follows:
Integer %
Long &
Single !
Double #
Currency @
String $
As Philip Sheard has said, it is an identifier type character for a Long. They are still present in .Net; see this MSDN link and this VB6 article.
From the second article:
The rules for forming a valid VB variable name are as follows:
(1) The first character must be a letter A through Z (uppercase or
lowercase letters may be used). Succeeding characters can be letters,
digits, or the underscore (_) character (no spaces or other characters
allowed).
(2) The final character can be a "type-declaration character". Only
some of the variable types can use them, as shown below:
Data Type    Type Declaration Character
String       $
Integer      %
Long         &
Single       !
Double       #
Currency     @
Use of type-declaration
characters in VB is not encouraged; the modern style is to use the
"As" clause in a data declaration statement.

Allowed characters in map key identifier in YAML?

Which characters are and are not allowed in a key (i.e. example in example: "Value") in YAML?
The YAML 1.2 specification simply advises using printable characters, with explicit control characters being excluded (see here):
In constructing key names, characters the YAML spec. uses to denote syntax or special meaning need to be avoided (e.g. # denotes comment, > denotes folding, - denotes list, etc.).
Essentially, you are left to the relative coding conventions (restrictions) by whatever code (parser/tool implementation) that needs to consume your YAML document. The more you stick with alphanumerics the better; it has simply been our experience that the underscore has worked with most tooling we have encountered.
It has been a shared practice with others we work with to convert the period character . to an underscore character _ when mapping namespace syntax that uses periods to YAML. Some people have similarly used hyphens successfully, but we have seen it misconstrued in some implementations.
Any character (if properly quoted by either single quotes 'example' or double quotes "example"). Please be aware that the key does not have to be a scalar ('example'). It can be a list or a map.

User-defined Literals suffix, with *_digit..."?

A user-defined literal suffix in C++0x should be an identifier that
starts with _ (underscore) (17.6.4.3.5)
should not begin with _ followed by uppercase letter (17.6.4.3.2)
Each name that [...] begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.
Is there any reason, why such a suffix may not start _ followed by a digit? I.E. _4 or _3musketeers?
Musketeer dartagnan = "d'Artagnan"_3musketeers;
int num = 123123_4; // to be interpreted in base4 system?
string s = "gdDadndJdOhsl2"_64; // base64decoder
The precedent for identifiers of the form _<number> is the function argument placeholder object mechanism in std::placeholders (§20.8.9.1.3), which defines an implementation-defined number of such symbols.
This is a good thing, because it means the user cannot #define any identifier of that form. §17.6.4.3.1/1:
A translation unit that includes a standard library header shall not #define or #undef names declared in any standard library header.
The name of the user-defined literal function is operator "" _123, not simply _123, so there is no direct conflict between your name and the library name, even in the presence of using namespace std::placeholders;.
My 2¢, though, is that you would be better off with an operator "" _baseconv and encoding the base within the literal, "123123_4"_baseconv.
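To make the suggested encoding concrete, here is a small Python sketch of the parsing such a conversion would have to do. It is not C++ and not a user-defined literal; baseconv is a hypothetical helper that only shows how a string like "123123_4" could be split into digits and base.

```python
def baseconv(literal):
    """Interpret '<digits>_<base>' as an integer written in the given base."""
    # Split on the last underscore so the base is whatever follows it.
    digits, separator, base = literal.rpartition("_")
    if not separator:
        raise ValueError("expected '<digits>_<base>'")
    return int(digits, int(base))


print(baseconv("123123_4"))
```

Keeping the base inside the string means a single suffix serves every base, rather than one suffix per base such as _4 or _64.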
Edit: Looking at Johannes' (deleted) answer, there may be concern that _123 could be used as a macro by the implementation. This is certainly the realm of theory, as the implementation would have little to gain by such preprocessor use. Furthermore, if I'm not mistaken, the reason for hiding these symbols in std::placeholders, not std itself, is that such names are more likely to be used by the user, such as by inclusion of Boost Bind (which does not hide them inside a named namespace).
The tokens are not reserved for use by the implementation globally (17.6.4.3.2), and there is precedent for their use, so they are at least as safe as, say, forward.
"can" vs "may".
can denotes ability where may denotes permission.
Is there a reason why you would not have permission to start a user-defined literal suffix with _ followed by a digit?
Permission implies coding standards or best practices. The examples you provide seem to show that _\d suffixes would be fine if used correctly (to denote numeric base). Unfortunately, your question can't have a well-thought-out answer, as no one has experience with this new language feature yet.
Just to be clear, user-defined literal suffixes can start with _\d.
An underscore followed by a digit is a legal user-defined literal suffix.
The function signature would be:
operator"" _4();
so it couldn't get eaten by a placeholder.
The literal would be a single preprocessor token:
123123_4;
so the _4 would not get clobbered by a placeholder or a preprocessor symbol.
My reading of 17.6.4.3.5 is that suffixes not containing a leading underscore risk collision with the implementation or future library additions. They also collide with existing suffixes: F, L, ULL, etc. One of the rationales for user-defined literals is that a new type (such as decimals, for example) could be defined as a pure library extension, including literals with suffixes d, df, dl.
Then there's the question of style and readability. Personally, I think I would lose sight of the suffix in 1234_3; maybe, maybe not.
Finally, there was some idea that didn't make it into the standard (but I kind of like) to have _ be a literal separator for numbers like in Ada and Ruby. So you could have 123_456_789 to visually separate thousands for example. Your suffix would break if that ever went through.
I knew I had some papers on this subject:
Digital Separators describes a proposal to use _ as a digit separator in numeric literals
Ambiguity and Insecurity with User-Defined literals Describes the evolution of ideas about literal suffix naming and namespace reservation and efforts to deconflict user-defined literals against a future digit separator.
It just doesn't look that good for the _ digit separator.
I had an idea though: how about either a backslash or a backtick for digit separator? It isn't as nice as _ but I don't think there would be any collision as long as the backslash was inside the stream of digits. The backtick has no lexical use currently that I know of.
i = 123\456\789;
j = 0xface\beef;
or
i = 123`456`789;
j = 0xface`beef;
This would leave _123 as a literal suffix.
