What exactly does KERN_INFO expand to, and where is it defined? - linux-kernel

In the question "Why doesn't the function printk() use a comma to separate parameters?", someone said KERN_INFO expands to "\001" "6". I know the leading \0 is the null character, but then what is the 01? I suppose it is one in octal. When the preprocessor concatenates the pieces into "\0016", the rest after the null is 016, which is 14 in decimal. I looked that up in an ASCII table and found 0E SO (shift out). That doesn't make sense to me; it should have something to do with logging, as that is the purpose of printk. So what is the meaning of the KERN_INFO macro sequence after expansion?
Also, I tried to look in the source, in /usr/include/linux/kernel.h, but didn't find the macro there. Is it in kernel.h or somewhere else?

"\001" "6" is two string literals that will be concatenated (with any other adjacent string literals) into a single string literal. (The concatenation is done at translation phase 6 as defined in the C standard.)
The first of those string literals, "\001" contains a single octal escape sequence, defining a single character. An octal escape sequence in a string literal or a character constant consists of the backslash (\) followed by from 1 to 3 octal digits (001 in this case). In this case, the single character has numeric code 1, which corresponds to the ASCII SOH (start of heading) character.
The string literal "\0016" contains sequences for two characters '\001' and '6', because an octal escape sequence is always terminated after at most 3 octal digits.
Escape sequences do not cross the boundary between adjacent string literals. (Escape sequences are converted at translation phase 5, so they have already been converted before adjacent string literals are concatenated at translation phase 6.) Therefore, the pair of string literals "\1" "6" is equivalent (after concatenation) to the single string literal "\0016", not "\16".
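The escape and concatenation rules above can be checked with a few arrays. This is a minimal sketch; the array names and the same_bytes helper are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Two adjacent literals: the escape "\001" is fixed before concatenation. */
const char kern_info_like[] = "\001" "6";

/* One literal: the octal escape stops after three digits, so '6' is separate. */
const char three_digit_escape[] = "\0016";

/* "\1" "6": the escape cannot absorb the '6' from the next literal. */
const char split_escape[] = "\1" "6";

/* "\16": here the escape does absorb the '6', giving one character, code 14. */
const char merged_escape[] = "\16";

/* Returns 1 if both byte sequences (terminator excluded) are identical. */
int same_bytes(const char *a, size_t a_len, const char *b, size_t b_len)
{
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

The first three arrays hold the same two characters (code 1, then '6'), while merged_escape holds a single character with code 14, which is the SO character the question stumbled over.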
As mentioned by @Peter L., the KERN_INFO macro and other "kernel level" macros are defined in "include/linux/kern_levels.h" in the Linux kernel source. That has been true since kernel version 3.6. Before kernel version 3.6, they were defined in "include/linux/printk.h" and used a different string format with the kernel level number specified between angle brackets (for example, KERN_INFO used to be defined as "<6>").
The purpose of these kernel level macros is to prefix the format string parameter of the printk function with special codes to designate the log-level to use for the message written to the kernel log (apart from KERN_CONT which specifies that the message is to be appended to the previous message).
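As a rough userspace illustration of how such a prefix can be read back, the sketch below copies a few of the 3.6+ style definitions and adds two helpers. The demo_ functions are hypothetical and are not the kernel's own code.

```c
#include <stddef.h>

/* Userspace copies of the definitions described above (kernel 3.6 and later). */
#define KERN_SOH     "\001"        /* ASCII Start Of Header */
#define KERN_ERR     KERN_SOH "3"
#define KERN_WARNING KERN_SOH "4"
#define KERN_INFO    KERN_SOH "6"

/*
 * Hypothetical helper, not a kernel function: returns the level character
 * that follows SOH ('0'..'7' for the numeric levels, 'c' for KERN_CONT),
 * or 0 if the message carries no level prefix.
 */
char demo_level_char(const char *msg)
{
    if (msg != NULL && msg[0] == '\001' && msg[1] != '\0')
        return msg[1];
    return 0;
}

/* Hypothetical helper: returns the message text after any level prefix. */
const char *demo_skip_level(const char *msg)
{
    return demo_level_char(msg) ? msg + 2 : msg;
}
```

For example, demo_level_char(KERN_INFO "hello\n") yields '6', and demo_skip_level returns a pointer to "hello\n".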

Related

linux kernel image strings extraction

I'm trying to extract the strings from a binary Linux kernel image
(this phenomenon happens in every type of image I've tried: bzImage, vmlinuz, vmlinux, ..., not in one specific type).
Simply running 'strings' on the image prints many strings with a prefix character, for example:
"4netlink: %d bytes leftover after parsing attributes in process `%s'."
However, looking at the kernel sources, this string should not include the "4" prefix.
Opening the file in a hex editor, I've seen that the string actually also includes
'\x00\x01' and only then '\x34' ("4").
My guess is that this is some kind of pointer to a special section, or something of the sort,
because many other strings include "3" and other numbers (and even characters).
I would appreciate any information on the matter.
Thanks!
The prefixes OP is seeing are KERN_<LEVEL> prefixes. These are special string literals placed before the main printk format string, relying on C's concatenation of adjacent string literals. For example:
printk(KERN_ERR "Something has gone wrong!\n");
From kernel version 3.6 onwards, these KERN_<LEVEL> prefix macros are defined in "include/linux/kern_levels.h" and begin with the ASCII SOH character "\001" followed by the log level as an ASCII digit for the numeric levels, or some other ASCII character for special meanings. The string for KERN_DEFAULT changed from "\001" "d" to "" (empty string) in kernel version 5.1. The string for KERN_CONT changed from "" (empty string) to "\001" "c" in kernel version 4.9.
From kernel version 2.6.37 to 3.5.x, the KERN_<LEVEL> prefix macros were defined in "include/linux/printk.h" and used a different format with the level specified between angle brackets, for example KERN_WARNING was defined as "<4>", KERN_DEFAULT was defined as "<d>", and KERN_CONT was defined as "<c>".
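To see why the digit shows up as the first visible character, note that SOH is not printable, so a strings-like tool skips it and starts its output at the level digit. The sketch below is a simplified stand-in for such a scan; first_printable_run is a made-up name and the real tool handles more cases.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: finds the first run of at least min_len printable
 * characters in buf[0..len) and returns its start offset, or -1 if none.
 * This mimics, in simplified form, how a strings-like tool picks its output.
 */
long first_printable_run(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t run_start = 0;
    size_t run_len = 0;

    for (size_t i = 0; i < len; i++) {
        if (isprint(buf[i])) {
            if (run_len == 0)
                run_start = i;          /* a new run begins here */
            if (++run_len >= min_len)
                return (long)run_start;
        } else {
            run_len = 0;                /* NUL, SOH, etc. break the run */
        }
    }
    return -1;
}
```

Given the bytes 00 01 followed by "4netlink: ...", the first run starts at offset 2, which is the '4'.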
Besides printk, there are other macros for generating kernel logs, some of which specify the KERN_<LEVEL> part implicitly. OP's example from "lib/nlattr.c":
pr_warn_ratelimited("netlink: %d bytes leftover after parsing attributes in process `%s'.\n",
                    rem, current->comm);
Here, the pr_warn_ratelimited macro is defined in "include/linux/printk.h" as:
#define pr_warn_ratelimited(fmt, ...) \
    printk_ratelimited(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
There is a lot going on there, but pr_fmt(fmt) expands to one or more string literals including the fmt macro parameter, so the string passed to printk_ratelimited is built from concatenated string literals, beginning with those from the expansion of KERN_WARNING.
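The macro layering can be imitated in userspace. In the sketch below, the "demo: " prefix, DEMO_WARN_FMT and demo_pr_warn are assumptions for illustration, snprintf stands in for printk, and ##__VA_ARGS__ is a GNU extension (accepted by GCC and Clang), as in the kernel macro.

```c
#include <stdio.h>

#define KERN_SOH     "\001"
#define KERN_WARNING KERN_SOH "4"

/* Assumed prefix for this demo; kernel code can define pr_fmt per source file. */
#define pr_fmt(fmt) "demo: " fmt

/* Hypothetical macro: the format string a pr_warn-style macro would pass on. */
#define DEMO_WARN_FMT(fmt) KERN_WARNING pr_fmt(fmt)

/* Formats a warning into buf, standing in for printk in userspace. */
#define demo_pr_warn(buf, size, fmt, ...) \
    snprintf((buf), (size), DEMO_WARN_FMT(fmt), ##__VA_ARGS__)
```

Calling demo_pr_warn(buf, sizeof buf, "%d bytes left\n", 7) leaves SOH and '4' in the first two bytes of buf, followed by "demo: 7 bytes left\n".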

Semantic meaning of '36_864_7_345ms' as a time literal

Reading the spec for Verilog, it appears that
36_864_7_345ms
is a valid time literal: http://www.ece.uah.edu/~gaede/cpe526/SystemVerilog_3.1a.pdf (see section 2)
Note: decimal_digit is defined as [0-9] in the full IEEE spec.
What is the semantic meaning (if any) of this time literal? Or am I misreading the spec?
Edit:
Looking elsewhere in the spec (section 3.7.9), it appears that the underscore characters are silently discarded. Does the underscore act as an arbitrary separating character, in the same way that numbers in English (e.g. 43,251) use commas to visually separate the digits? Or is there another meaning altogether?
The spec you quoted from is long since obsolete. Please get the latest from the IEEE where it says in section 5.7.1 Integer literal constants:
The underscore character (_) shall be legal anywhere in a number
except as the first character. The underscore character is ignored.
This feature can be used to break up long numbers for readability
purposes.
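The rule quoted above can be sketched in a few lines of C. This covers only plain decimal digits, not the full literal syntax or the time unit, and parse_underscored is a made-up name.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: parses an unsigned decimal number that may contain
 * underscores, ignoring each underscore. Returns 1 on success and stores the
 * value in *out; returns 0 if the text is empty, starts with an underscore,
 * or contains any other non-digit character. Overflow is not checked.
 */
int parse_underscored(const char *text, unsigned long *out)
{
    unsigned long value = 0;

    if (text == NULL || out == NULL || !isdigit((unsigned char)text[0]))
        return 0;

    for (const char *p = text; *p != '\0'; p++) {
        if (*p == '_')
            continue;               /* underscores carry no meaning */
        if (!isdigit((unsigned char)*p))
            return 0;
        value = value * 10 + (unsigned long)(*p - '0');
    }
    *out = value;
    return 1;
}
```

Under this reading, the numeric part 36_864_7_345 is simply 368647345, however oddly the underscores are placed.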

How can I add a hair space (&#8202;) to a gsub string?

I have the following line in a plugin to display page views on my Jekyll site:
html = pv.to_s.reverse.gsub(/...(?=.)/,'\& ').reverse
It adds a space between thousands, for example 23 678.
How can I add a hair space (&#8202;) instead of a regular space in this string?
In HTML, &#8202; is a so-called decimal numeric character reference:
The ampersand must be followed by a "#" (U+0023) character, followed by one or more ASCII digits, representing a base-ten integer that corresponds to a Unicode code point that is allowed according to the definition below. The digits must then be followed by a ";" (U+003B) character.
Ruby has the \u escape sequence. However, it expects the following characters to represent a hexadecimal (base-sixteen) integer; for the hair space that is 200A. You also have to use a double-quoted string literal, which means the \ character now needs to be escaped with another one:
"\\&\u200A"
Alternatively, just put the literal hair space character directly in the single-quoted string:
'\& '
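For comparison, here is the same grouping written out by hand in C rather than with a Ruby regex. It is only a sketch: group_thousands is a made-up helper, and the hair space (U+200A) is written as its three UTF-8 bytes.

```c
#include <stddef.h>
#include <string.h>

/* U+200A HAIR SPACE encoded as UTF-8 (three bytes). */
static const char HAIR_SPACE_UTF8[] = "\xE2\x80\x8A";

/*
 * Hypothetical helper: copies the decimal digits in `digits` to `out`,
 * inserting a hair space between groups of three counted from the right.
 * Returns the number of bytes written (excluding the terminator), or 0 if
 * `out` is too small.
 */
size_t group_thousands(const char *digits, char *out, size_t out_size)
{
    size_t n = strlen(digits);
    size_t separators = (n == 0) ? 0 : (n - 1) / 3;
    size_t needed = n + separators * 3 + 1;
    size_t pos = 0;

    if (needed > out_size)
        return 0;

    for (size_t i = 0; i < n; i++) {
        /* A separator goes before digit i when a multiple of 3 digits remain. */
        if (i > 0 && (n - i) % 3 == 0) {
            memcpy(out + pos, HAIR_SPACE_UTF8, 3);
            pos += 3;
        }
        out[pos++] = digits[i];
    }
    out[pos] = '\0';
    return pos;
}
```

For "23678" this writes 23, the three hair-space bytes, then 678, for eight bytes in total.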

Shell # inside (( ))

I am new to shell scripting and do not quite understand the following function, which basically increases the hour by 1.
I am wondering why the developer put "10#" in front of $g_current_hour+1. From my understanding, doesn't # start a comment in shell?
f_increment_hour() {
    g_next_hour=$((10#$g_current_hour+1))
}
Everything depends on the context. Here 10# means base 10.
Constants with a leading 0 are interpreted as octal numbers. A
leading 0x or 0X denotes hexadecimal. Otherwise, numbers take the
form [base#]n, where the optional base is a decimal number between
2 and 64 representing the arithmetic base, and n is a number in that
base. If base# is omitted, then base 10 is used.
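The leading-zero pitfall that 10# guards against has a close analogue in C's strtol, sketched below. This is an analogy only, not how the shell is implemented: for an hour such as 08, bash's $((08+1)) reports an error because 8 is not an octal digit, whereas strtol with base 0 silently stops at the 8.

```c
#include <stdlib.h>

/*
 * Base 0 lets strtol infer the base from the prefix, so a leading 0 means
 * octal. This mirrors the shell behaviour only by analogy.
 */
long parse_inferred_base(const char *text)
{
    return strtol(text, NULL, 0);
}

/* Forcing base 10 plays the role of the 10# prefix in the shell function. */
long parse_forced_decimal(const char *text)
{
    return strtol(text, NULL, 10);
}
```

With "010", the inferred-base version gives 8 and the forced-decimal version gives 10. With "08", only the forced-decimal version gives the intended 8.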
'#' will be interpreted as part of a token unless it is preceded by a space, newline, or semicolon (or any other non-word symbol).
Section 2.3 "Token Recognition" of the language spec states:
7. If the current character is an unquoted <newline>, the current
token shall be delimited.
8. If the current character is an unquoted <blank>, any token
containing the previous character is delimited and the current
character shall be discarded.
9. If the previous character was part of a word, the current character
shall be appended to that word.
10. If the current character is a '#', it and all subsequent characters
up to, but excluding, the next <newline> shall be discarded as
a comment. The <newline> that ends the line is not considered
part of the comment.
When the shell is parsing its input and reads "foo#bar", as it is processing the '#' character it applies rule 9 and appends the # to the token. Once rule 9 is applied, it stops checking and rule 10 is never considered. If the character preceding the '#' is whitespace, then rule 9 does not apply, so rule 10 is checked and a comment is started.
In other words, a '#' only starts a comment if the character preceding it is not part of a word (e.g. whitespace or a semicolon), so "foo#bar" is one token, not "foo" followed by a comment, whereas "foo #bar" is the token "foo" followed by a comment.
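The effect of rules 9 and 10 can be approximated with a small scanner. This is a deliberately simplified sketch: comment_start is a made-up helper that treats only blanks and ';' as delimiters and ignores quoting, escapes and other operators.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the offset of the '#' that starts a comment in
 * `line`, or -1 if there is none. Simplification: a '#' starts a comment only
 * at the start of the line or right after a blank or ';'. Quoting, escapes
 * and other operators are ignored here.
 */
long comment_start(const char *line)
{
    for (size_t i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        if (line[i] != '#')
            continue;               /* only '#' can begin a comment */
        if (i == 0 || line[i - 1] == ' ' || line[i - 1] == '\t' ||
            line[i - 1] == ';')
            return (long)i;
    }
    return -1;
}
```

For "foo#bar" and "x=$((10#08+1))" the result is -1, because the '#' continues a word. For "foo #bar" it is 4.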

Double Quotes in ASCII

What is the ASCII number for the double quote? (")
Also, is there a link to a list anywhere?
Finally, how do you enter it in the C family (esp. C#)?
The ASCII code for the quotation mark is 34.
There are plenty of ASCII tables on the web. Note that some describe the standard 7-bit ASCII code, while others describe various 8-bit extensions that are super-sets of ASCII.
To put quotation marks in a string, you escape it using a backslash:
string msg = "Let's just call it a \"duck\" and be done with it.";
To put a quotation mark in a character literal, you don't need to escape it:
char quotationMark = '"';
Note: Strings and characters in C# are not ASCII, they are Unicode. As Unicode is a superset of ASCII the codes are still usable though. You would need a Unicode character table to look up some characters, but an ASCII table works fine for the most common characters.
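The same escaping rules apply in C, another member of the C family. The sketch below assumes an ASCII-compatible execution character set; the names are made up for illustration.

```c
#include <stddef.h>

/* In a character constant the double quote needs no escape. */
const char QUOTE_CHAR = '"';

/* In a string literal it must be escaped with a backslash. */
const char QUOTED_WORD[] = "a \"duck\"";

/* Hypothetical helper: counts the double-quote characters (code 34) in text. */
size_t count_double_quotes(const char *text)
{
    size_t count = 0;

    for (; *text != '\0'; text++) {
        if (*text == 34)            /* 34 is the ASCII code for '"' */
            count++;
    }
    return count;
}
```

Here QUOTE_CHAR compares equal to 34, and count_double_quotes(QUOTED_WORD) finds the two quotation marks around duck.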
It's 34. And you can find a list on Wikipedia.
Yes, the answer is 34.
To find the ASCII value of special characters and other characters, here is a small VBScript. In Notepad, write the script below, save it as abc.vbs (any name with the extension .vbs), and double-click the file to run it; you will see the double quote shown for 34.
For i = 0 To 150
    MsgBox i & "=" & Chr(i)
Next
You are never going to need only one quote, right?
So I declare a char variable:
char DoubleQuote;
then drop in a double quote:
DoubleQuote = Convert.ToChar(34);
and use the variable DoubleQuote wherever you need it.
This also works in SQL to generate dynamic SQL, but there you need
DECLARE @SingleQuote CHAR(1)
and
SET @SingleQuote = CHAR(39)
