What's the correct way to include commas in Windows' day, month, year format strings? - winapi

MSDN says this about Windows' date format strings as returned by LOCALE_SLONGDATE:
If spaces are used to separate the elements in the string, these spaces will appear in the same location in the output string. For example, to get the date string "Wed, Aug 31 94", the application uses the picture string "ddd',' MMM dd yy". The application uses single quotation marks to mark characters to display exactly as specified.
To me, this sounds like all characters except spaces must be escaped using single quotation marks. As shown above, the comma after Wed is escaped as ',' in the format string. However, when I query LOCALE_SLONGDATE using GetLocaleInfo() I get the following format string on my Windows system using the default locale (German):
dddd, d. MMMM yyyy
In contrast to the example on MSDN from above, there are no quotes around the comma here. It's completely unescaped, just like the space characters.
So, is that documentation on MSDN just very outdated (cough August 1994), or what is the correct way to format those date strings?

Related

printing text without use of format string in bash

Have been doing printf "text" to print some text from a bash script. Is using printf without a format string valid to do?
Putting the entire message in the format string is a reasonable thing to do provided it doesn't contain any dynamic data. As long as you have full control over the string (i.e. it's either just a fixed string, or one selected from a set of fixed strings, or something like that), and you've used that control to make sure it doesn't contain any unintended escape characters, all % characters in it are doubled (making them literal, rather than format specifiers), and the string doesn't start with -.
Basically, if it's a fixed string and it doesn't obviously fail, it'll work consistently.
But if it contains any sort of dynamic data -- filenames, user-entered data, anything at all like that -- you should put format specifiers in the format string, and the dynamic data in separate arguments.
So these are ok:
printf 'Help, Help, the Globolinks!\n'
printf 'Help, Help, the %s!\n' "$monster_name"
But this is not:
printf "Help, Help, the $monster_name!\n" # Don't do this

Creation of file-format in Snow-flake

I am new to SF. I have a typical problem that I have faced while loading some data. The delimiter is part of extended ascii. It does not come in 0-127. We use thorn (ascii - 254) as delimiter. My Qn is while specifying the delimiter can I give the ascii code of that delimiter instead of actual character (44 instead of comma, 9 instead of tab etc)
Thanks in advance
You can specify the hex/octal code of any valid Unicode delimiter in the FIELD_DELIMITER option of the File Format. From the documentation:
The specified delimiter must be a valid UTF-8 character and not a random sequence of bytes.
For example, for fields delimited by the thorn (Þ) character, specify the octal (\336) or hex (0xDE) value. Also accepts a value of NONE.

Is using ASCII 10 inside a HL7 segment a valid way to represent a new line?

Placing an ASCII 10 (0A) character somewhere inside of a segment of an HL7 message to represent a new line character. Is this valid?
From what I can see it is recommend to use \X0D\ or \X0D0A\ to represent a new line character for plain text format HL7. Is using just the 0A ASCII character explicitly invalid HL7?
To respond to the question "Is using just the 0A ASCII character explicitly invalid HL7?":
The character 0A is not mentioned anywhere in the HL7 specs as being special.
Extract from the HL7 2.5 US specs:
2.5.4 Message delimiters
In constructing a message, certain special characters are used. They are the segment terminator, the field
separator, the component separator, subcomponent separator, repetition separator, and escape character. The
segment terminator is always a carriage return (in ASCII, a hex 0D). The other delimiters are defined in the
MSH segment, with the field delimiter in the 4th character position, and the other delimiters occurring as in
the field called Encoding Characters, which is the first field after the segment ID. The delimiter values used
in the MSH segment are the delimiter values used throughout the entire message. In the absence of other
considerations, HL7 recommends the suggested values found in Figure 2-1 delimiter values.
Strictly speaking this would mean that you could use the character 0A just as any of the characters other than the 6 previously mentioned.
<end of "formal" reply>
That being said, I concur with Dale H. that you should better stay away from using this character in the content of an HL7 message. Since most editors (except old-fashioned Notepad on Windows) will display this character as a new line, you might unwillingly think that a segment was truncated or malformed. And I've had at least one instance where the interface engine indeed handled that character as a segment termination (which in itself is invalid, and the interface engine build was modified to not do this anymore).
So better avoid this. But in situations where you don't control the output, it doesn't seem to be a formally disallowed character...
Linefeeds (0x0A) are not allowed in HL7 messages. If you edit messages with notepad, wordpad and many other text editors, they will convert carriage returns (0x0D) to CR/LF (0x0D 0x0A) and if you save, you now have a corrupt HL7 message. Avoid LFs (0x0A).
If you only send 0A then there is no way to determine that you wanted ASCII 10/line feed and it would be assumed you wanted a zero and an A.
Standard HL7 with the escape character being a \, then yes the recommended way would be \X0A\. The \X representing the start of hexadecimal data, followed by two-character hexadecimal values, ending with a \.
That being said, if you are sending this data to a system then they should be able to tell you what they accept for lines feeds. I've seen systems that use \.br\ or the repetition character ~ to determine a new line. And sometimes they want repeating segments. For example below, each OBX segment is a new line of a report in the system.
OBX|1|TX|||This is line one
OBX|2|TX|||This is line two

MacVim Replace All Issue

I have an html file that I need to replace some characters with html entities. Right now I'm trying to replace — with — but when I use the Replace All button, the result is that all of those instances of — are replaced with —mdash;
I thought maybe escaping the "&" will work, so I changed the Replace with value to \— but that just results in \—mdash;
The strange thing is that if I go to each, one by one, i.e., click Next, then click Replace, and so on, then it replaces it correctly.
Is this a bug in MacVim? Or am I missing something?
Enter into command line:
:%s/—/\—/g
Also it's possible to get character code. Place your cursor on the character and press ga. Use decimal, hex or octal code into replacement string:
\%d match specified decimal character
\%x match specified hex character
\%o match specified octal character
\%u match specified multibyte character
\%U match specified large multibyte character
:%s/\%d8212/\$mdash;/g

Parse /var/email/username file in Ruby

For some reason I need to fetch emails from /var/mail/username file. It seems like an append only file.
My question is, is it safe to parse the content of the /var/email/username file depending on the first line From username#host Mon Jun 20 16:50:15 2011? What if the similar pattern found inside the email body?
Furthermore, is there any opensource ruby script available for reference?
Yes, that seems like more or less the right way to parse the mbox format - from a quick scan of the RFC specification:
The structure of the separator lines
vary across implementations, but
usually contain the exact character
sequence of "From", followed by a
single Space character (0x20), an
email address of some kind, another
Space character, a timestamp sequence
of some kind, and an end-of- line
marker.
And...
Many implementations are also known
to escape message body lines that
begin with the character sequence of
"From ", so as to prevent confusion
with overly-liberal parsers that do
not search for full separator
lines. In the common case, a leading
Greater-Than symbol (0x3E) is used
for this purpose (with "From "
becoming ">From "). However, other
implementations are known not to
escape such lines unless they are
immediately preceded by a blank line
or if they also appear to contain
an email address and a timestamp.
Other implementations are also
known to perform secondary escapes
against these lines if they are
already escaped or quoted, while
others ignore these mechanisms
altogether.
Update:
There's also this: https://github.com/meh/ruby-mbox

Resources