^ meaning in "let's build a compiler" code

In Jack Crenshaw's "Let's Build a Compiler", what does the ^I mean in this statement:
const TAB = ^I;
He also uses ^G in one of his functions.

From the Free Pascal Language Reference:
Also, the caret character ( ^ ) can be used in combination with a
letter to specify a character with ASCII value less than 27. Thus ^G
equals #7 - G is the seventh letter in the alphabet.
The compiler is rather sloppy about the characters it allows after the caret, but in general
one should assume only letters.
The result is a one-byte ASCII character constant. I is the 9th letter in the alphabet. And the ASCII value 9 is – no surprise – the TAB character.

It's Control-I. This translates to ASCII character 9, which is Tab. Similarly, Ctrl-G is ASCII character 7, BEL (literally "bell"), which usually generates a beep from a console.
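The letter-to-code mapping described in both answers can be checked with a few lines of Python. This is only an illustrative sketch, not Pascal: the function name caret_char is made up for the example, and it handles only the letters A-Z.

```python
def caret_char(letter: str) -> str:
    """Return the control character written as ^<letter> in caret notation."""
    # Position in the alphabet: A -> 1, B -> 2, ..., Z -> 26
    code = ord(letter.upper()) - ord("A") + 1
    if not 1 <= code <= 26:
        raise ValueError("expected a single letter A-Z")
    return chr(code)


print(repr(caret_char("I")))  # '\t'   (TAB, ASCII 9)
print(repr(caret_char("G")))  # '\x07' (BEL, ASCII 7)
```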

Related

Why is there no super or subscript "q" or "Q" characters defined in UTF-8?

The header says it all. I just would like to be able to write,
inline but I am stuck with eᵖᐟᵠ. Is there any way at all to achieve this simple thing?

There are superscript q and Q characters defined in Unicode version 14 (column CodePoint contains Unicode (U+hhhh) and UTF-8 bytes; column Description contains surrogates in parentheses):
Char CodePoint Description
---- --------- -----------
ꟴ {U+A7F4, 0xEA,0x9F,0xB4} MODIFIER LETTER CAPITAL Q
𐞥 {U+107A5, 0xF0,0x90,0x9E,0xA5} MODIFIER LETTER SMALL Q (0xd801,0xdfa5)
They appear in UnicodeData.txt as follows (I had downloaded the file beforehand):
findstr "Q;Lm;" D:\Utils\CodePages\UnicodeData.txt
A7F4;MODIFIER LETTER CAPITAL Q;Lm;0;L;<super> 0051;;;;N;;;;;
107A5;MODIFIER LETTER SMALL Q;Lm;0;L;<super> 0071;;;;N;;;;;
You need to find a font containing a glyph for Modifier Letter Small Q (maybe the Last Resort font family?). Try my answer to another question, "How to determine if a Glyph can be displayed?"
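The byte sequences and the surrogate pair listed in the table can be reproduced from the code points alone. The following Python sketch does that with the built-in codecs; the helper name hex_bytes is just for this example.

```python
capital_q = "\ua7f4"    # U+A7F4  MODIFIER LETTER CAPITAL Q
small_q = "\U000107a5"  # U+107A5 MODIFIER LETTER SMALL Q


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


print(hex_bytes(capital_q.encode("utf-8")))    # EA 9F B4
print(hex_bytes(small_q.encode("utf-8")))      # F0 90 9E A5
# Outside the BMP, UTF-16 needs a surrogate pair (two 16-bit units)
print(hex_bytes(small_q.encode("utf-16-be")))  # D8 01 DF A5
```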

What exactly expands the KERN_INFO, and where it is implemented?

In this question, "Why doesn't the function printk() use a comma to separate parameters?", someone said KERN_INFO expands to "\001" "6". I know that \0 is the null character, but then what is the 01? I suppose it is one in octal. When the preprocessor concatenates the literals into "\0016", the rest after the null is 016, which is 14 in decimal. I looked that up in an ASCII table and found 0E, SO (shift out). That doesn't make sense to me, since it should have something to do with logging (the purpose of printk). So what is the meaning of the character sequence that the KERN_INFO macro expands to?
Also, I tried to look in the source, in /usr/include/linux/kernel.h, but didn't find the macro there. So is it in kernel.h or somewhere else?

"\001" "6" is two string literals that will be concatenated (with any other adjacent string literals) into a single string literal. (The concatenation is done at translation phase 6 as defined in the C standard.)
The first of those string literals, "\001" contains a single octal escape sequence, defining a single character. An octal escape sequence in a string literal or a character constant consists of the backslash (\) followed by from 1 to 3 octal digits (001 in this case). In this case, the single character has numeric code 1, which corresponds to the ASCII SOH (start of heading) character.
The string literal "\0016" contains sequences for two characters '\001' and '6', because an octal escape sequence is always terminated after at most 3 octal digits.
Escape sequences do not cross the boundary between adjacent string literals. (Escape sequences are expanded at translation phase 3, so are already expanded before adjacent string literals are concatenated at translation phase 6). Therefore, the pair of string literals "\1" "6" is equivalent (after concatenation) to the single string literal "\0016", not "\16".
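Python string literals happen to follow the same two rules (an octal escape takes at most three digits, and escapes are resolved per literal before adjacent literals are joined), so the claim can be checked quickly there. This is an analogy in Python, not C, and the variable names are only for illustration.

```python
split_literals = "\1" "6"  # two adjacent literals, joined after escapes are resolved
three_digit = "\0016"      # the escape stops after three octal digits: \001 then '6'
two_digit = "\16"          # a single character with octal code 16

print([ord(c) for c in split_literals])  # [1, 54]  -> SOH followed by '6'
print([ord(c) for c in three_digit])     # [1, 54]
print([ord(c) for c in two_digit])       # [14]     -> SO (shift out)
```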
As mentioned by @Peter L., the KERN_INFO macro and other "kernel level" macros are defined in "include/linux/kern_levels.h" in the Linux kernel source. Actually, that is true since kernel version 3.6. Before kernel version 3.6, they were defined in "include/linux/printk.h" and used a different string format with the kernel level number specified between angle brackets (for example KERN_INFO used to be defined as "<6>").
The purpose of these kernel level macros is to prefix the format string parameter of the printk function with special codes to designate the log-level to use for the message written to the kernel log (apart from KERN_CONT which specifies that the message is to be appended to the previous message).

Space characters inside the ESC[ (0x1b/0x5b) sequence invalidate the sequence?

I cannot find information about how spaces within an ESC sequence should be handled. Example: position cursor
ESC[10;20H
is a valid ESC sequence, but is the one including spaces like
ESC[ 10; 20H
valid too? The point is that while the ESC character is a control character with code 0x1b, the text following it is human- and machine-readable, and in general spaces should not change the meaning of an ESC sequence, so I would just remove all the spaces found within an ESC sequence.
Lots of articles on the internet describe what ESC sequences are and what they may consist of (though only a few are good and really informative), but none of them clarifies this matter.
I found this one, and it says
Since ASCII control functions do not follow a structured syntax, the notation used to describe function sequences and parameters is important to avoid confusion. Escape sequences are shown with a space between each character to make them easier to read. These spaces are not part of the Escape sequence.
While it says that the space character separates characters for readability, it does not say whether keeping the space invalidates the ESC sequence.
Is there any related RFC for it? I hope it unambiguously defines this case.
Update: thanks to Thomas for pointing out that the space character is one of the ESC sequence operators. So now it is clear that [ should follow the ESC character, and a space is not allowed between them.
But what about the arguments that follow? As in the example above, do spaces before the row and column coordinates, ESC[SP10;SP20H, make the sequence invalid, so that I must stop processing it and start displaying the space character instead?
Update1: I did a small test using the Windows telnet application. I logged into a remote server that responds with an ESC sequence. The result is:
ESC[2;5H positions properly row 2 column 5
ESC[ 2; 5H displays "2; 5H" in current cursor position
ESC[2 ; 5H displays "; 5H" in current cursor position
So based on these empirical findings, I suspect spaces are NOT allowed, and a space character invalidates/cancels the sequence.

ECMA-48's the place to look (if you want an RFC). Look for mention of 02/00 (the way it represents the hexadecimal 0x20 for space).
For what it's worth, there are DEC control sequences (VT220 and up) with an embedded space, e.g., the ones marked with SP:
Controls beginning with ESC
This excludes controls where ESC is part of a 7-bit equivalent to 8-bit
C1 controls, ordered by the final character(s).
ESC SP F 7-bit controls (S7C1T), VT220.
ESC SP G 8-bit controls (S8C1T), VT220.
ESC SP L Set ANSI conformance level 1 (dpANS X3.134.1).
ESC SP M Set ANSI conformance level 2 (dpANS X3.134.1).
ESC SP N Set ANSI conformance level 3 (dpANS X3.134.1).
In your examples, the whitespace is just for readability, and non-printing characters such as ASCII escape (decimal 27, hexadecimal 0x1b) are shown by a name such as ESC.
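As I read ECMA-48, a control sequence after ESC [ consists of parameter bytes (0x30-0x3F), then intermediate bytes (0x20-0x2F, which includes the space), then one final byte (0x40-0x7E). Under that reading, a space may appear only immediately before the final byte, never before or between the numeric parameters. The Python sketch below encodes that layout as a regular expression; the names CSI_PATTERN and is_well_formed_csi are invented for the example, and real terminals may differ in how they recover from a malformed sequence.

```python
import re

# Layout of a 7-bit control sequence as described in ECMA-48 (assumed reading)
CSI_PATTERN = re.compile(
    r"\x1b\["    # ESC [
    r"([0-?]*)"  # parameter bytes, 0x30-0x3F (digits, ':', ';', '<', '=', '>', '?')
    r"([ -/]*)"  # intermediate bytes, 0x20-0x2F (space through '/')
    r"([@-~])"   # final byte, 0x40-0x7E
)


def is_well_formed_csi(sequence: str) -> bool:
    """Return True if the whole string is one control sequence in that layout."""
    return CSI_PATTERN.fullmatch(sequence) is not None


print(is_well_formed_csi("\x1b[10;20H"))    # True
print(is_well_formed_csi("\x1b[ 10; 20H"))  # False: parameter after a space
print(is_well_formed_csi("\x1b[2 ; 5H"))    # False: ';' after a space
print(is_well_formed_csi("\x1b[2 q"))       # True: space directly before the final byte
```

The two rejected strings are the same ones that the telnet test in the question printed literally instead of moving the cursor.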

code128 barcode with tilde and asterisk

I am maintaining a printing program that now requires printing both a ~ and an * in a code128 barcode in zpl.
Currently, I am using the code below that uses the ^FH to represent the tilde in hex:
^BCN,120,Y,N,N,N^FH^FDSPECIAL*MAKE_7e123456^FS
The barcode prints excluding the * and ~ as 'SPECIALMAKE123456'. Is it possible to print the tilde and asterisk in a zpl code128 barcode?

As a quick guess, since I don't have a ZPLII printer immediately available, I'd try
^BCN,120,Y,N,N,A^FH^FDSPECIAL*MAKE_7e123456^FS
(note A before the ^FH = Auto-select codeset)
Perhaps also forcing a codeset by ...^FH^FD>:SPECIAL*... may work, but subset B is the default in any case...
I located my old A300 printer, and was able to produce the required interpretation line using each of
^BCN,120,Y,N,N,A^FH^FDSPECIAL*MAKE_7E123456^FS
^BCN,120,Y,N,N,A^FH^FDSPECIAL_2AMAKE_7E123456^FS
Can't find my scanner to verify at present - but the computer room is a mite tidier...
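If the field data is generated by a program, the _XX escapes used with ^FH above can be produced automatically. The following Python sketch is only an illustration: the function name zpl_hex_escape and the default set of escaped characters are my own assumptions, and it expects ASCII input so that each escape is exactly two hex digits.

```python
def zpl_hex_escape(text: str, special: str = "~*^_") -> str:
    """Replace each character in `special` with '_' plus its two-digit hex code.

    Assumes the default ^FH indicator '_' and ASCII-only input.
    """
    return "".join(f"_{ord(ch):02X}" if ch in special else ch for ch in text)


field = zpl_hex_escape("SPECIAL*MAKE~123456")
print(f"^BCN,120,Y,N,N,A^FH^FD{field}^FS")
# ^BCN,120,Y,N,N,A^FH^FDSPECIAL_2AMAKE_7E123456^FS
```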

It may depend on the type of barcode.
For example, to print Code 128, you have to switch to subset B with the characters >:
And: to print tilde ~, type >=. To print ^, type ><. To print >, type >0.
See the table of Code 128 invocation characters in the ZPL documentation.
My sample zpl code:
^XA
^BY2,3,95^FT0,206^BCN,,Y,N
^FD>:caret >< bigger >0 tilde >= end^FS
^PQ1,1,1,Y^XZ

Shell # inside (( ))

I am new to shell. I do not quite understand the following function. It basically increases the hour by 1.
I am wondering why the developer put "10#" in front of $g_current_hour+1. From my understanding, doesn't # in shell start a comment?
f_increment_hour() {
    g_next_hour=$((10#$g_current_hour+1))
}

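Everything depends on the context. Here 10# means base 10. It forces a value with a leading zero, such as 08 or 09, to be read as decimal instead of being rejected as an invalid octal number. From the manual: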
Constants with a leading 0 are interpreted as octal numbers. A
leading 0x or 0X denotes hexadecimal. Otherwise, numbers take the
form [base#]n, where the optional base is a decimal number between
2 and 64 representing the arithmetic base, and n is a number in that
base. If base# is omitted, then base 10 is used.
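To see why the 10# prefix matters for an hour such as 08, here is a rough Python model of the rules quoted above. It is a sketch, not how the shell is implemented: the function name parse_bash_integer is hypothetical, and it only supports bases up to 36 (the limit of Python's int()), whereas the shell allows up to 64.

```python
def parse_bash_integer(token: str) -> int:
    """Interpret an integer constant roughly the way shell arithmetic does."""
    if "#" in token:
        # Explicit form [base#]n
        base_text, digits = token.split("#", 1)
        base = int(base_text, 10)
        if not 2 <= base <= 36:
            raise ValueError("this sketch only supports bases 2 to 36")
        return int(digits, base)
    if token[:2].lower() == "0x":
        return int(token[2:], 16)  # leading 0x or 0X: hexadecimal
    if token.startswith("0") and len(token) > 1:
        return int(token[1:], 8)   # leading 0: octal, so 8 and 9 are invalid digits
    return int(token, 10)


print(parse_bash_integer("10#08") + 1)  # 9
print(parse_bash_integer("010"))        # 8 (octal 10)
try:
    parse_bash_integer("08")
except ValueError as error:
    print("rejected:", error)
```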

'#' will be interpreted as part of a token unless it is preceded by a space, newline, semicolon, or any other non-word symbol.
Section 2.3, "Token Recognition", of the language spec states:
7. If the current character is an unquoted <newline>, the current
token shall be delimited.
8. If the current character is an unquoted <blank>, any token
containing the previous character is delimited and the current
character shall be discarded.
9. If the previous character was part of a word, the current character
shall be appended to that word.
10. If the current character is a '#' , it and all subsequent characters
up to, but excluding, the next <newline> shall be discarded as
a comment. The <newline> that ends the line is not considered
part of the comment.
When the shell is parsing its input and reads "foo#bar", as it is processing the '#' character it applies rule 9 and appends the # to the token. Once rule 9 is applied, it stops checking and rule 10 is never considered. If the character preceding the '#' is whitespace, then rule 9 does not apply, so rule 10 is checked and a comment is started.
In other words, a '#' only starts a comment if the character preceding it is not part of a word (e.g. whitespace or a semicolon), so "foo#bar" is one token, and not "foo" followed by a comment, but "foo #bar" is the token "foo" followed by a comment.
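The effect of rules 9 and 10 can be imitated in a few lines of Python. This is a deliberately simplified sketch: the function name split_comment and the set of operator characters are assumptions for the example, and quoting and backslash escapes are ignored entirely.

```python
def split_comment(line: str) -> tuple[str, str]:
    """Split a shell line into (code, comment) using the word-boundary rule."""
    previous_in_word = False
    for index, char in enumerate(line):
        if char == "#" and not previous_in_word:
            # Rule 10: '#' at the start of a token begins a comment
            return line[:index].rstrip(), line[index:]
        # Rule 9: anything that is not blank or an operator continues a word
        previous_in_word = not (char.isspace() or char in ";&|()<>")
    return line, ""


print(split_comment("foo#bar"))   # ('foo#bar', '')
print(split_comment("foo #bar"))  # ('foo', '#bar')
print(split_comment("g_next_hour=$((10#$g_current_hour+1))"))
# ('g_next_hour=$((10#$g_current_hour+1))', '')
```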
