I have the following line in a plugin to display page views on my Jekyll site:
html = pv.to_s.reverse.gsub(/...(?=.)/,'\& ').reverse
It adds a space between thousands, for example 23 678.
How can I add a hair space instead of a regular space in this string?
In HTML there is a so-called decimal numeric character reference:
The ampersand must be followed by a "#" (U+0023) character, followed by one or more ASCII digits, representing a base-ten integer that corresponds to a Unicode code point that is allowed according to the definition below. The digits must then be followed by a ";" (U+003B) character.
Ruby has the \u escape sequence. However, it expects the following characters to represent a hexadecimal (base-sixteen) integer; for the hair space (U+200A) that's 200A. You also have to use a double-quoted string literal, which means the \ character now needs to be escaped with another one:
"\\&\u200A"
Alternatively, just paste the hair space character directly into the single-quoted string:
'\& '
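Putting it together (a minimal sketch; pv is the page-view count from the question):
html = pv.to_s.reverse.gsub(/...(?=.)/, "\\&\u200A").reverse
# pv = 23678 gives "23 678" (the separator is U+200A, a hair space)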
I’m printing a parameter returned from a query that’s a string of letters and underscores.
The label prints just the letters without the underscores, and I’m not sure how to fix it.
^FD<String>^FS
^FH^FD<String>^FS
Thank you very much.
(Removing the ^FH only reads up to the first underscore.)
The ^FH command without a parameter defaults to the underscore as the hexadecimal escape character. Either remove the ^FH or specify a different escape character, such as backslash: ^FH\^FD<String>^FS.
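For example (a minimal label sketch; AB_CD_EF stands in for the actual query result):
^XA
^FO50,50^A0N,30,30
^FH\^FDAB_CD_EF^FS
^XZ
With backslash as the escape indicator, the underscores in the field data print literally instead of being consumed as hexadecimal escapes.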
In this question: Why doesn't the function printk() use a comma to separate parameters?, someone said KERN_INFO expands to "\001" "6". I know that \0 is the null character, but then what is the 01? I suppose it is one in octal. When the preprocessor concatenates it into "\0016", the rest after the null would be 016, which is 14 in decimal. So I looked it up in the ASCII table and found 0x0E, SO (shift out)? That doesn't make sense to me, since it should have something to do with logging (that being the purpose of printk). So what is the meaning of the KERN_INFO macro after expansion?
Also, I tried to look in the source, in /usr/include/linux/kernel.h, but didn't find the macro there. So is it in kernel.h or somewhere else?
"\001" "6" is two string literals that will be concatenated (with any other adjacent string literals) into a single string literal. (The concatenation is done at translation phase 6 as defined in the C standard.)
The first of those string literals, "\001" contains a single octal escape sequence, defining a single character. An octal escape sequence in a string literal or a character constant consists of the backslash (\) followed by from 1 to 3 octal digits (001 in this case). In this case, the single character has numeric code 1, which corresponds to the ASCII SOH (start of heading) character.
The string literal "\0016" contains sequences for two characters '\001' and '6', because an octal escape sequence is always terminated after at most 3 octal digits.
Escape sequences do not cross the boundary between adjacent string literals. (Escape sequences are expanded at translation phase 3, so are already expanded before adjacent string literals are concatenated at translation phase 6). Therefore, the pair of string literals "\1" "6" is equivalent (after concatenation) to the single string literal "\0016", not "\16".
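A quick way to see the difference (a minimal sketch, compilable with any C compiler):
#include <stdio.h>
int main(void) {
    const char a[] = "\1" "6";  /* two literals: bytes 0x01 and '6' (0x36) */
    const char b[] = "\16";     /* one octal escape: the single byte 0x0E */
    printf("%d %d\n", a[0], a[1]);  /* prints: 1 54 */
    printf("%d\n", b[0]);           /* prints: 14 */
    return 0;
}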
As mentioned by @Peter L., the KERN_INFO macro and the other kernel-level macros are defined in "include/linux/kern_levels.h" in the Linux kernel source. Actually, that has been true since kernel version 3.6. Before kernel version 3.6, they were defined in "include/linux/printk.h" and used a different string format, with the kernel level number specified between angle brackets (for example, KERN_INFO used to be defined as "<6>").
The purpose of these kernel-level macros is to prefix the format string parameter of the printk function with special codes designating the log level to use for the message written to the kernel log (apart from KERN_CONT, which specifies that the message is to be appended to the previous message).
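For reference, the relevant definitions in include/linux/kern_levels.h (kernel 3.6 and later, abridged):
#define KERN_SOH  "\001"        /* ASCII Start Of Header */
#define KERN_INFO KERN_SOH "6"  /* informational */
So a call like printk(KERN_INFO "device ready\n"); passes the single format string "\0016device ready\n"; the kernel recognizes the \001 marker, reads the level digit 6, and strips both before logging the message.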
I am new to Snowflake. I have a typical problem that I have faced while loading some data. The delimiter is part of extended ASCII; it does not fall in the 0-127 range. We use thorn (ASCII 254) as the delimiter. My question is: while specifying the delimiter, can I give the ASCII code of that delimiter instead of the actual character (44 instead of comma, 9 instead of tab, etc.)?
Thanks in advance
You can specify the hex/octal code of any valid Unicode delimiter in the FIELD_DELIMITER option of the File Format. From the documentation:
The specified delimiter must be a valid UTF-8 character and not a random sequence of bytes.
For example, for fields delimited by the thorn (Þ) character, specify the octal (\336) or hex (0xDE) value. Also accepts a value of NONE.
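So for the thorn delimiter, a file format definition could look like this (a minimal sketch; thorn_csv is a placeholder name):
CREATE OR REPLACE FILE FORMAT thorn_csv
  TYPE = CSV
  FIELD_DELIMITER = '\336';  -- octal for Þ; the hex form '0xDE' also works
The same pattern answers the original question: '\054' (octal for 44) would specify a comma, and '\011' (octal for 9) a tab.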
I have an HTML file in which I need to replace some characters with HTML entities. Right now I'm trying to replace — with &mdash; but when I use the Replace All button, the result is that all of those instances of — are replaced with —mdash;
I thought maybe escaping the "&" would work, so I changed the Replace with value to \&mdash; but that just results in \—mdash;
The strange thing is that if I go to each, one by one, i.e., click Next, then click Replace, and so on, then it replaces it correctly.
Is this a bug in MacVim? Or am I missing something?
Enter this on the command line:
:%s/—/\&mdash;/g
It's also possible to work from the character code: place your cursor on the character and press ga to display its value. You can then use the decimal, hex, or octal code in the search pattern:
\%d match specified decimal character
\%x match specified hex character
\%o match specified octal character
\%u match specified multibyte character
\%U match specified large multibyte character
:%s/\%d8212/\&mdash;/g
I have the following string "\u3048\u3075\u3057\u3093". I got the string from a web page as part of returned data in JSONP.
What is that? It looks like UTF-8, but then shouldn't it look like "U+3048U+3075U+3057U+3093"?
What's the meaning of the backslashes (\)?
How can I convert it to a human-readable form?
I'm looking to a solution with Ruby, but any explanation of what's going on here is appreciated.
The U+3048 syntax is normally used to represent the Unicode code point of a character. That code point is fixed and does not depend on the encoding (UTF-8, UTF-32, ...).
A JSON string is composed of Unicode characters, except for the double quote, the backslash, and those in the U+0000 to U+001F range (control characters). Characters can be represented with an escape sequence starting with \u and followed by 4 hexadecimal digits that represent the Unicode code point of the character. This is JavaScript syntax (JSON is a subset of it). In JavaScript, the backslash is used as the escape character.
It is Unicode, but not UTF-8; it is UTF-16. The escapes are UTF-16 code units, so as long as no surrogate pairs are involved, you can read each \uXXXX as the 4-digit hexadecimal code point of a Unicode character.
Using Ruby 1.9:
require 'json'
puts JSON.parse("[\"\\u4e00\",\"\\u4e8c\"]")
Prints:
一
二
Unicode characters in JSON are escaped as backslash u followed by four hex digits. See the string production on json.org.
Any JSON parser will convert it to the correct representation for your platform (if it doesn't, then by definition it is not a JSON parser).
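Applied to the string from the question (wrapped in a JSON array so the parser has a top-level container):
require 'json'
puts JSON.parse('["\u3048\u3075\u3057\u3093"]').first
This prints えふしん.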