cross platform logging in Golang - windows

I am developing a Go program on a Mac which has Parallels installed with Windows so that I can test on both platforms. My program works well. I can compile a Windows ".exe" file on my Mac and run it from Windows, and it works well except for the log file.
I have set the logger to write its output to a file like so:
log.SetOutput(projectsLog)
Where projectsLog is declared above it as shown below:
projectsLog *os.File
I am using log.Printf statements since I want formatted output. An example is shown below:
log.Printf("Error: wrong Hra Class value %s in row %v for project/path %s", hraClass, (rowNum + 1), testDir)
This works great on the Mac: each log.Printf call is logged on a separate line. On Windows, however, the line breaks do not show and I get one long line without line breaks. I am well aware of the "\n" vs. "\r\n" difference between Unix and Windows, but I thought that log.Printf would behave appropriately for the platform it runs on?
If my assumption is wrong then what are some of the options that I have to make sure that the log file is readable on Windows? If I can, I do not want to pass flags, e.g., -platform windows or some such thing. Can this be handled in a transparent manner?

As noted, the fmt package always uses \n as the newline sequence regardless of the OS, Windows included. The log package uses fmt under the hood, so the same applies to it. For functions whose names do not end in ln (e.g. log.Printf()), a \n is appended if the message does not already end with one, as documented at Logger.Output() (to which log.Printf() forwards).
Just deal with \n as the newline. If you do need to print \r\n, you have to handle that manually by appending a \r character at the end of the format string, e.g.:
log.Printf("This will be terminated by CR+LF\r") // \n is appended automatically
You may create a wrapper function for it:
func winprintf(format string, a ...interface{}) {
	log.Printf(format+"\r", a...)
}
Note that this however will only print \r\n at the end of the log entry; but if you use \n inside the format string or the arguments are strings (or will result in a string by calling their String() method) containing \n, those will not turn into \r\n automatically. You may use strings.Replace() to handle those too.
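To cover embedded newlines without touching every call site, one option is to wrap the log output in a writer that converts line endings. The following is only a sketch: toCRLF, crlfWriter and platformWriter are hypothetical names, not part of the log package, and os.Stdout stands in for the projectsLog file. It relies on the log package making a single Write call per log entry, so a "\r\n" pair is never split across two writes.

```go
package main

import (
	"bytes"
	"io"
	"log"
	"os"
	"runtime"
)

// toCRLF normalizes line endings to "\r\n". Existing "\r\n" pairs are
// first collapsed to "\n" so they are not doubled into "\r\r\n".
func toCRLF(p []byte) []byte {
	p = bytes.ReplaceAll(p, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(p, []byte("\n"), []byte("\r\n"))
}

// crlfWriter (hypothetical name) converts line endings before writing
// to the wrapped writer.
type crlfWriter struct {
	w io.Writer
}

func (c crlfWriter) Write(p []byte) (int, error) {
	if _, err := c.w.Write(toCRLF(p)); err != nil {
		return 0, err
	}
	// Report the caller's byte count, not the converted one, so the
	// io.Writer contract (n == len(p) on success) still holds.
	return len(p), nil
}

// platformWriter wraps w only when goos is "windows".
func platformWriter(w io.Writer, goos string) io.Writer {
	if goos == "windows" {
		return crlfWriter{w: w}
	}
	return w
}

func main() {
	// os.Stdout stands in for the projectsLog file from the question.
	log.SetOutput(platformWriter(os.Stdout, runtime.GOOS))
	log.Printf("Error: wrong Hra Class value %s in row %v", "X", 3)
}
```

Because the decision is made from runtime.GOOS, no command-line flag is needed, and the existing log.Printf calls stay unchanged.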

Related

InstallScript GetLine() can not read text file contains result from command prompt

My Installation needs to check the result of a command from cmd.exe. Thus, I redirect the result of the command to a text file and then try to read the file to get the result as follows:
// send command to cmd to execute and redirect the result to a text file
// try to read the file
szDir = "D:\\";
szFileName = "MyFile.txt";
if Is(FILEEXISTS, szDir ^ szFileName) then
    listID = ListCreate(STRINGLIST);
    if listID != LIST_NULL then
        if OpenFileMode(FILE_MODE_NORMAL) = 0 then
            if OpenFile(nFileHandle, szDir, szFileName) = 0 then
                // I run into problem here
                while (GetLine(nFileHandle, szCurLine) = 0)
                    ListAddString(listID, szCurLine, AFTER);
                endwhile;
                CloseFile(nFileHandle);
            endif;
        endif;
    endif;
endif;
The problem is that right after the command prompt is executed and the result is redirected to MyFile.txt, I can set the open file mode and open the file, but I cannot read any text into my list. ListReadFromFile() does not help either. If I open the file, edit it and save it manually, my script works.
After debugging, I figured that GetLine() returns an error code (-1) which means the file pointer must be at the end of file or other errors. However, FILE_MODE_NORMAL sets the file as read only and SET THE FILE POINTER AT THE BEGINNING OF THE FILE.
What did I possibly do wrong? Is this something to do with read/write access of the file? I tried this command without result:
icacls D:\MyFile.txt /grant Administrator:(R,W)
I am using InstallShield 2018 and Windows 10 64-bit, by the way. Your help is much appreciated.
EDIT 1: I suspected the encoding and tried a few things:
1. After running "wslconfig /l", the content of MyFile.txt opened in Notepad++ is shown without an encoding, but still appears normal and readable. I tried to convert the content to UTF-8 but it did not work.
2. If I add something to the file (echo This line is appended >> MyFile.txt), the encoding changes to UTF-8, but the content from step 1 changes as well: a NULL (\0) is added between every character and even replaces the newline characters. Maybe this is why GetLine() failed to read the file.
Workaround: after step 1, I run find "my_desired_content" MyFile.txt > TempFile.txt and read TempFile.txt (which is encoded in UTF-8).
My ultimate goal is to check whether "my_desired_content" appears in the result of "wslconfig /l", so this is fine. However, what I don't understand is why MyFile.txt and TempFile.txt, both created from cmd commands, are encoded differently.
The problem is due to the contents of the file. Assuming this is the file generated by your linked question, you can examine its contents in a hex editor to find out the following facts:
Its contents are encoded in UTF-16 (LE) without a BOM
Its newlines are encoded as CR or CR CR instead of CR LF
I thought the newlines would be more important than the text encoding, but it turns out I had it backwards. If I change each of these things independently, GetLine seems to function correctly for either CR, CR CR, or CR LF, but only handles UTF-16 when the BOM is present. (That is, in a hex editor, the file starts with FF FE 57 00 instead of 57 00 for a file starting with the character W.)
I'm at a bit of a loss for the best way to address this. If you're up for a challenge, you could read the file with FILE_MODE_BINARYREADONLY, and can use your extra knowledge about what should be in the file to ensure you interpret its encoding correctly. Note that for most of UTF-16, you can create a single code unit by combining two bytes in the following manner:
szResult[i] = (nHigh << 8) + nLow;
where nHigh and nLow are probably values like szBuffer[2*i + 1] and szBuffer[2*i], assuming you filled a STRING szBuffer by calling ReadBytes.
Other unproven ideas include editing it in binary to ensure the BOM (FF FE) is present, figuring out ways to ensure the file is originally created with the BOM, figuring out ways to create it in an alternate encoding, finding another command you can invoke to "fix" the file, or lodging a request with the vendor (my employer) and hoping the development team changes something to better handle this case.
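For readers who want to see the byte-level idea outside InstallScript, here is a minimal Go sketch of two of the options above: prepending the FF FE signature, and combining byte pairs into UTF-16 code units by hand. The function names ensureBOM and decodeUTF16LE are made up for this illustration, and the sketch assumes the data really is UTF-16 LE, which is only what was observed for this particular file.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// utf16LEBOM is the UTF-16 little-endian signature (FF FE).
var utf16LEBOM = []byte{0xFF, 0xFE}

// ensureBOM returns data with the UTF-16 LE signature prepended,
// unless the signature is already present.
func ensureBOM(data []byte) []byte {
	if bytes.HasPrefix(data, utf16LEBOM) {
		return data
	}
	return append(append([]byte{}, utf16LEBOM...), data...)
}

// decodeUTF16LE assumes data is UTF-16 LE, with or without a BOM.
func decodeUTF16LE(data []byte) string {
	data = bytes.TrimPrefix(data, utf16LEBOM)
	units := make([]uint16, 0, len(data)/2)
	for i := 0; i+1 < len(data); i += 2 {
		// Equivalent to (high << 8) + low for data[i+1] and data[i].
		units = append(units, binary.LittleEndian.Uint16(data[i:]))
	}
	// utf16.Decode also joins surrogate pairs into single code points.
	return string(utf16.Decode(units))
}

func main() {
	raw := []byte{0x57, 0x00, 0x53, 0x00, 0x4C, 0x00} // "WSL" without a BOM
	fmt.Printf("% X\n", ensureBOM(raw))
	fmt.Println(decodeUTF16LE(raw))
}
```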
Here's an easier workaround. If you can safely assume that the command will append UTF-16 characters without a signature, you can append this output to a file that has just a signature. How do you get such a file?
You could create a file with just the BOM in your development environment, and add it to your Support Files. If you need to use it multiple times, copy it around first.
You could create it with code. Just call the following (error checking omitted for clarity)
OpenFileMode(FILE_MODE_APPEND_UNICODE);
CreateFile(nFileHandle, szDir, szFileName);
CloseFile(nFileHandle);
and if szDir ^ szFileName didn't exist, it will now be a file with just the UTF-16 signature.
Assuming this file is called sig.txt, you can then invoke the command
wslconfig /l >> sig.txt
to write to that file. Note the doubled >> for append. The resulting file will include the Unicode signature you created ahead of time, plus the Unicode data output from wslconfig, and GetLine should interpret things correctly.
The biggest problem here is that this hardcodes around the behavior of wslconfig, and that behavior may change at any point. This is why Christopher alludes to recommending an API, and I agree completely. In the meantime, you could try to make this more robust by invoking it in a cmd /U (but my understanding of what that does or guarantees is fuzzy at best), or by trying the original way and then with the BOM.
This whole WSL thing is pretty new. I don't see any APIs for it, but rather than screen-scraping command output you might want to look at this registry key:
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Lxss
It seems to have the list of installed distros that come from the store. Coming from the store probably explains why this is HKCU and not HKLM.
A brave new world.... sigh.

Bash script sourcing config file but can't use vars in arithmetic

This is killing me. I have a config file, "myconfig.cfg", with the following content:
SOME_VAR=2
echo "I LOVE THIS"
Then I have a script that I'm trying to run, that sources the config file in order to use the settings in there as variables. I can print them out fine, but when I try to put one into a numeric variable for use in something like a "seq " command, I get this weird "invalid arithmetic operator" error.
Here's the script:
#!/bin/bash
source ./myconfig.cfg
echo "SOME_VAR=${SOME_VAR}"
let someVarNum=${SOME_VAR}
echo "someVarNum=${someVarNum}"
And here's the output:
I LOVE THIS
SOME_VAR=2
")syntax error: invalid arithmetic operator (error token is "
someVarNum=
I've tried countless things that theoretically shouldn't make a difference, and, surprise, they don't. I simply can't figure it out. If I simply take the line "SOME_VAR=2" and put it directly into the script, everything's fine. I'm guessing I'll have to read in the config file line by line, split the strings by "=", and find+create the variables I want to use manually.
The error is precisely as indicated in a comment by @TomFenech. The first line (and possibly all the lines) in myconfig.cfg is terminated with a Windows CR-LF line ending. Bash considers CR to be an ordinary character (not whitespace), so it will set SOME_VAR to the two character string 2CR. (CR is the character with hex code 0x0D. You could see that if you display the file with a hex-dumper: hd myconfig.cfg.)
The let command performs arithmetic on numbers. It also considers the CR to be an ordinary character, but it is neither a digit nor an operator so it complains. Unfortunately, it does not make any attempt to sanitize the display of the character in the error message, so the carriage return is displayed between the two " symbols. Consequently, the end of the error message overwrites the beginning.
Don't create Unix files with a Windows text editor. Or use a utility like dos2unix to fix them once you copy them to the Unix machine.
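The same trap exists in any language that parses such a file by hand. As an illustration only (this is not what Bash does internally), the Go sketch below shows a stray CR breaking a numeric parse and a trim fixing it. parseSetting is a hypothetical helper; printing with %q makes the otherwise invisible \r show up in error messages.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSetting splits a "KEY=VALUE" line and parses VALUE as an integer,
// after stripping any trailing CR/LF characters.
func parseSetting(line string) (string, int, error) {
	line = strings.TrimRight(line, "\r\n")
	key, value, ok := strings.Cut(line, "=")
	if !ok {
		return "", 0, fmt.Errorf("no '=' in %q", line)
	}
	n, err := strconv.Atoi(value)
	if err != nil {
		// %q escapes control characters, so a CR is shown as \r.
		return "", 0, fmt.Errorf("value of %s is not a number: %q", key, value)
	}
	return key, n, nil
}

func main() {
	// Without trimming, the value is "2\r", which is not a number.
	_, err := strconv.Atoi("2\r")
	fmt.Printf("raw parse error: %v\n", err)

	key, n, err := parseSetting("SOME_VAR=2\r\n")
	fmt.Println(key, n, err)
}
```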

Why is \r\n being converted to \n when a string is saved to a file?

The string is originating as a return value from:
> msg = imap.uid_fetch(uid, ["RFC822"])[0].attr["RFC822"]
In the console if I type msg, a long string is displayed with double quotes and \r\n separating each line:
> msg
"Delivered-To: email@test.com\r\nReceived: by xx.xx.xx.xx with SMTP id;\r\n"
If I match part of it with a regex, the return value has \r\n:
> msg[/Delivered-To:.*?\s+Received:/i]
=> "Delivered-To: email@test.com\r\nReceived:"
If I save the string to a file, read it back in and match it with the same regex, I get \n instead of \r\n:
> File.write('test.txt', msg)
> str = File.read('test.txt')
> str[/Delivered-To:.*?\s+Received:/i]
=> "Delivered-To: email@test.com\nReceived:"
Is \r\n being converted to \n when the string is saved to a file?
Is there a way to save the string to a file, read it back in without the line endings being modified?
This is covered in the IO.new documentation:
The following modes must be used separately, and along with one or more of the modes seen above.
"b" Binary file mode
Suppresses EOL <-> CRLF conversion on Windows. And
sets external encoding to ASCII-8BIT unless explicitly
specified.
"t" Text file mode
In other words, Ruby, like many other languages, senses the OS it's on and will automatically translate line-ends between "\r\n" <-> "\n" when reading/writing a file in text mode. Use binary mode to avoid translation.
str = File.read('test.txt', mode: 'rb')
A better practice would be to read the file using foreach, which removes the need to even care about line endings; you'll get each line separately. An alternative is readlines, but it slurps the whole file, which can be very costly on large files.
Also, if you're processing mail files, I'd strongly recommend using something written to do so rather than write your own. The Mail gem is one such package that's pre-built and well tested.

GoldParser: Accept programs not ending with an empty line

I'm rewriting a GoldParser grammar for VBScript. In VBScript, statements are terminated using either a newline or ':'. Therefore I use the following terminal:
NewLine = {All Newline}
| ':'
Because every statement has to end with the NewLine terminal, only programs ending with an empty line are accepted. How can I extend the NewLine terminal to also accept programs not ending with an empty line? I tried the following:
NewLine = {All Newline}
| ':'
| {EOF}
This does not work because the {EOF} (End of File) group does not exist.
EOF is a special token and I'm not aware of any syntax allowing you to use it in a production rule. It is emitted when the tokenizer receives no more data, and as such it is not a control character you could use in a terminal definition either.
That being said, you have different possibilities to parse the (strictly speaking invalid) input. The simplest may be to just append a newline at the end of the string or text being tokenized. While this will not make it parse correctly in the GOLD Builder test window, it will make your code process the data as expected and it will not add complexity to the grammar.
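The preprocessing step is independent of the GOLD engine in use. A minimal sketch in Go, where ensureTrailingNewline is a hypothetical helper that would be applied to the source text before it is handed to whatever tokenizer the host program uses:

```go
package main

import (
	"fmt"
	"strings"
)

// ensureTrailingNewline appends "\n" unless src already ends with one,
// so the last statement is always followed by a NewLine terminal.
func ensureTrailingNewline(src string) string {
	if strings.HasSuffix(src, "\n") {
		return src
	}
	return src + "\n"
}

func main() {
	// A one-line VBScript program with no trailing newline.
	src := `MsgBox "done"`
	fmt.Printf("%q\n", ensureTrailingNewline(src))
}
```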

Does Perl's /m regex modifier match differently on Windows?

The following Perl statements behave identically on Unixish machines. Do they behave differently on Windows? If yes, is it because of the magic \n?
split m/\015\012/ms, $http_msg;
split m/\015\012/s, $http_msg;
I got a failure on one of my CPAN modules from a Win32 smoke tester. It looks like it's an \r\n vs \n issue. One change I made recently was to add //m to my regexes.
For these regexes:
m/\015\012/ms
m/\015\012/s
Both /m and /s are meaningless.
/s: makes . match \n too.
Your regex doesn't contain .
/m: makes ^ and $ match next to embedded \n in the string.
Your regex contains no ^ nor $, or their synonyms.
What is possible is that, if your input handle (a socket?) works in text mode, the \r (\015) characters will have been deleted on Windows.
So, what to do? I suggest making the \015 characters optional, and split against
/\015?\012/
No need for /m, /s or even the leading m//. Those are just cargo cult.
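The optional-CR pattern is not specific to Perl. For comparison, here is a small Go sketch of the same split; splitLines and the sample message are invented for the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// lineBreak matches LF with an optional preceding CR, i.e. the
// equivalent of the Perl pattern /\015?\012/.
var lineBreak = regexp.MustCompile(`\r?\n`)

// splitLines splits msg on either "\r\n" or a bare "\n".
func splitLines(msg string) []string {
	return lineBreak.Split(msg, -1)
}

func main() {
	// Mixed line endings, as might arrive after a text-mode read.
	msg := "HTTP/1.1 200 OK\r\nContent-Type: text/plain\nContent-Length: 0"
	for _, line := range splitLines(msg) {
		fmt.Printf("%q\n", line)
	}
}
```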
There is no magic \n. Both \n and \r always mean exactly one character, and on all ASCII-based platforms that is \cJ and \cM respectively. (The exceptions are EBCDIC platforms (for obvious reasons) and MacOS Classic (where \n and \r both mean \cM).)
The magic that happens on Windows is that when doing I/O through a file handle that is marked as being in text mode, \r\n is translated to \n upon reading and vice versa upon writing. (Also, \cZ is taken to mean end-of-file – surprise!) This is done at the C runtime library layer.
You need to binmode your socket to fix that.
You should also remove the /s and /m modifiers from your pattern: since you do not use the meta-characters whose behaviour they modify (. and the ^/$ pair, respectively), they do nothing – cargo cult.
Why did you add the /m? Are you trying to split on lines? To do that with /m you need to use either ^ or $ in the regex:
my @lines = split /^/m, $big_string;
However, if you want to treat a big string as lines, just open a filehandle on a reference to the scalar:
open my $string_fh, '<', \ $big_string;
while( <$string_fh> ) {
    # ... process a line
}
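The "read a string as if it were a file" idea carries over to other languages. A rough Go analogue, with readLines as a hypothetical helper, uses bufio.Scanner over a strings.Reader. Unlike Perl's readline, the default scanner strips the line terminator, including an optional trailing \r, and it rejects lines longer than its default buffer limit.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readLines treats s as a stream and returns its lines without their
// terminators. bufio.ScanLines accepts both "\n" and "\r\n".
func readLines(s string) ([]string, error) {
	var lines []string
	sc := bufio.NewScanner(strings.NewReader(s))
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	// Err reports problems such as an over-long line; nil at plain EOF.
	return lines, sc.Err()
}

func main() {
	lines, err := readLines("first\r\nsecond\nthird\n")
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for _, line := range lines {
		fmt.Printf("%q\n", line)
	}
}
```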

Resources