PowerShell uses different encoding in ISE vs. console - Oracle

I have a script that merges data from an Oracle database (via SQL*Plus) and creates CSV files from it.
When I run this script from PowerShell ISE everything looks fine, but when I run it from a normal PowerShell session, or via right-click -> Run with PowerShell, characters like ä, ö, ü are not displayed correctly in the CSV file; instead, characters like "÷" and "³" appear.
I already set the encoding for the CSV-Export to UTF8 with
export-csv -Encoding UTF8
But the problem doesn't seem to be the export, but rather how I get the data from the SQL*Plus session.
I already checked which encoding each session runs in by logging [Console]::OutputEncoding.
In the ISE the encoding is iso-8859-1, while in the PowerShell console it is ibm850.
So I guess my solution would be to force the PowerShell session to run with the same encoding the ISE session uses. I just don't know how to set this parameter.
I already tried executing the following commands before I run the Script:
set NLS_LANG=AMERICAN_AMERICA.UTF8
set NLS_LANG=UTF8
$OutputEncoding = [System.Text.UTF8Encoding]::new()
None of this worked; the OutputEncoding remained ibm850.
I also tried to alter the Oracle Session with
ALTER SESSION SET NLS_LANGUAGE='UTF8'
But that returns an error, since UTF8 is a character set name, not a valid NLS_LANGUAGE value (valid values are languages such as AMERICAN).
A minimal code example would be as follows:
$sqlQuery_Example = @"
set pagesize 0
set feedback off
set heading off
set linesize 15000;
set colsep ;;
set headsep ;;
Select * FROM table
"@
$ExampleSQL = $sqlQuery_Example | sqlplus -silent USER/PW@DB
$ExampleCSV = ConvertFrom-Csv $ExampleSQL
$ExampleCSV | export-csv C:\TEMP\example.csv -Encoding UTF8 -NoTypeInformation -Delimiter ';'
I use PowerShell version 5.1.17763.2268.
Can someone help me change the encoding of the PowerShell console session to match the ISE session?

The solution for my script was adding the following line before the SQL*Plus session is started:
$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [Text.Encoding]::GetEncoding((Get-Culture).TextInfo.ANSICodePage)
Now Ö, Ü, Ä and so on are displayed correctly.
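The garbled characters follow directly from the code-page mismatch described above: SQL*Plus emits ISO-8859-1 bytes and the console decodes them as CP850. A minimal Python sketch (illustration only; the original problem is in PowerShell) reproduces the exact substitutions:

```python
# Mojibake mechanism: Latin-1 bytes decoded with the OEM code page 850.
latin1_bytes = "äöü".encode("latin-1")   # bytes as an ISO-8859-1 session emits them
garbled = latin1_bytes.decode("cp850")   # how an ibm850 console interprets them
print(garbled)  # õ÷³ -- "ö" becomes "÷" and "ü" becomes "³", as in the question
```

This is why aligning [Console]::OutputEncoding with the ANSI code page, as in the fix above, makes the characters come out right.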

When running .ps1 files, the PowerShell instance defaults to whatever encoding the file is detected as.
This detection can be unreliable and usually falls back to the system encoding.
Saving .ps1 files with "UTF-8 with BOM" encoding (sometimes referred to as UTF-8 signature) makes PowerShell read the script as UTF-8.
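As a quick illustration of what "UTF-8 with BOM" means on disk (sketched in Python; the three signature bytes are what PowerShell looks for):

```python
import codecs
import os
import tempfile

# "UTF-8 with BOM" is plain UTF-8 preceded by the signature bytes EF BB BF.
path = os.path.join(tempfile.mkdtemp(), "script.ps1")
with open(path, "w", encoding="utf-8-sig") as f:  # utf-8-sig writes the BOM
    f.write('Write-Host "äöü"')
with open(path, "rb") as f:
    raw = f.read()
print(raw[:3] == codecs.BOM_UTF8)  # True: the file starts with b'\xef\xbb\xbf'
```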

Related

Encoding problem if I put my code in module or other ps1 file

My code was working well with special chars. I could use Write-Host "é" without any issue.
And then I moved some of my functions to another .ps1 file that I "dot sourced" (using Import-Module does the same), and I got encoding errors: prénom became prÃ©nom.
I don't understand anything about encoding. VS Code doesn't let me change the encoding of an individual file. It has a setting for the default encoding, but that already defaults to UTF8, and when I set Windows1252 nothing changes. If I use Geany to change the encoding to Windows1252 it works... until I save the file again with VS Code.
Everything was working well when all my code was in the same file. Why would creating this second .ps1 file (which I created from the Windows Explorer) be a problem?
Working on Windows 10, in french, with VS Code 1.50.
Thank you in advance

Doxygen doesn't read Doxyfile when PowerShell scripts updates it

Good day Stackoverflow.
As the title says, I have an issue with Doxygen.
Description
A PowerShell script modifies the PROJECT_NUMBER variable of my Doxyfile.
Then it runs Doxygen, but Doxygen generates the HTML and LaTeX documentation as if it were reading a default generated Doxyfile.
If I manually modify the Doxyfile with Notepad++ before running this script, Doxygen works perfectly, but once the script has run, the issue appears.
I would also mention that my Doxyfile has:
GENERATE_HTML = YES
GENERATE_LATEX = NO
GENERATE_MAN = YES
In practice Doxygen behaves like this:
.\doxygen.exe -g
.\doxygen.exe .\Doxyfile
The bizarre behaviour begins now!
Let's call my actual Doxyfile CustomConfig and the default generated DefaultConfig.
If I generate a DefaultConfig through .\doxygen.exe -g and then overwrite its content with the text of CustomConfig via Notepad++, Doxygen accepts the Doxyfile, as it should, and generates correct output!
So the problem is not the Doxyfile's content but the way PowerShell modifies the file.
I've verified this by doing a simple copy&paste of the entire content:
Copy&Paste through Notepad++: WORK
Copy&Paste through PowerShell: DOESN'T WORK
PowerShell Script
# Replace the old PROJECT_NUMBER with the new one
$DOXY_PATH = $env:FS_OS + "\doc"
$CONFIG_PATH = $DOXY_PATH + "\bin\Doxyfile"
$BIN_PATH = $DOXY_PATH + "\bin\doxygen.exe"
$GIT_PATH = $env:FS_OS
$GIT_BRANCH = "Development"
# Get git commit number on the specified branch
$GIT_HASH = git log $GIT_BRANCH -1 --pretty=format:%H
$PRJ_CONTENT = Get-Content $CONFIG_PATH
$PRJ_NUM = "PROJECT_NUMBER = " + $GIT_HASH
$PRJ_CONTENT = $PRJ_CONTENT -replace "PROJECT_NUMBER\s*=\s*[0-9a-fA-F]{40}",$PRJ_NUM  # git hashes are hex; [A-z] also matches stray punctuation
$PRJ_CONTENT | Out-File -FilePath $CONFIG_PATH
Start-Process -FilePath $BIN_PATH -ArgumentList "$CONFIG_PATH" -WorkingDirectory ($DOXY_PATH + "\bin")
Copy&Paste Script
$var = Get-Content "./doc/bin/Doxyfile.bak"
$var | Out-File -FilePath "./doc/bin/Doxyfile"
Thanks to @BenH for the comment, I've found the solution.
It looks like PowerShell's Out-File writes files with a BOM by default.
I've found a solution with the Accepted Answer from this question:
Using PowerShell to write a file in UTF-8 without the BOM
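The mechanism can be sketched in Python. (The exact bytes differ: Windows PowerShell's Out-File default is UTF-16LE with a BOM, not UTF-8, but the effect on a parser expecting plain text is the same: the first key it reads is polluted by the BOM.)

```python
# A naive config parser reading text whose writer prepended a BOM.
data = "PROJECT_NUMBER = abc123\n".encode("utf-8-sig")  # BOM + text
first_line = data.decode("utf-8")      # a plain utf-8 decode keeps the BOM char
key = first_line.split("=")[0].strip() # str.strip() does not remove U+FEFF
print(repr(key))  # '\ufeffPROJECT_NUMBER' -- not the key the parser expects
```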

python3 subprocess returning local encoding bytes makes information lost

This is my code:
# -*- coding: utf-8 -*-
import subprocess as sp
import locale
LOCAL_ENCODING = locale.getpreferredencoding()
cmds = ['dir', '/b', '*.txt']
out = sp.check_output(cmds, shell=True)
print(out)
print(out.decode(LOCAL_ENCODING))
s = 'レミリア・スカレート.txt'
print(s.encode(LOCAL_ENCODING, 'replace'))
print(LOCAL_ENCODING)
# print(s.encode('utf-8'))
This is the output:
b'\xa5\xec\xa5\xdf\xa5\xea\xa5\xa2?\xa5\xb9\xa5\xab\xa5\xec\xa9`\xa5\xc8.txt\r\n'
レミリア?スカレート.txt
b'\xa5\xec\xa5\xdf\xa5\xea\xa5\xa2?\xa5\xb9\xa5\xab\xa5\xec\xa9`\xa5\xc8.txt'
cp936
(A text file named 'レミリア・スカレート.txt' is in the script directory.)
As the result shows, the bytes of the returned file name have been automatically encoded with the local encoding, which cannot fully represent the filename (note the ? in the bytes), so some information is lost.
Environment:
- win10 Chinese Simplified
- python-3.5.1
My question is:
Is it possible to avoid the automatic local encoding and get UTF-8 (or some other specified encoding) bytes?
I read this issue, but found no solution :-(
1. For built-in commands, solved by eryksun's answer:
out = sp.check_output('cmd.exe /u /c "dir /b *.txt"').decode('utf-16le')
(/u: output Unicode characters (UTF-16LE); /c: run the command and then terminate)
2. For external programs there is no general solution: configure the output to use a proper encoding via the external program's own options or configuration (such options may not exist). For example, in recent versions of WinRAR you can set the encoding of console RAR messages: rar lb -scur data > list.txt produces a Unicode list.txt with the archived file names.
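Both the loss and the fix can be checked directly in Python: CP936 has no mapping for the katakana middle dot (hence the ? in the question's output), while UTF-16LE, as produced by cmd.exe /u, round-trips every character losslessly.

```python
s = 'レミリア・スカレート.txt'
# cp936 (GBK) cannot encode '・' (U+30FB), so 'replace' degrades it to '?'
lossy = s.encode('cp936', 'replace').decode('cp936')
print(lossy)                                          # レミリア?スカレート.txt
# UTF-16LE keeps the full filename intact
print(s.encode('utf-16le').decode('utf-16le') == s)   # True
```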

powershell character encoding from System.Net.WebClient

I am running the following command:
([xml](new-object net.webclient).DownloadString(
"http://blogs.msdn.com/powershell/rss.aspx"
)).rss.channel.item | format-table title,link
The output for one of the RSS items contains this weird text:
You Donâ€™t Have to Be An Administrator to Run Remote PowerShell Commands
So, the question is:
Why the mix-up in characters? What happened to the apostrophe? Why is the output rendered as Donâ€™t when it should just render as Don't?
How would I get the correct character in the PowerShell standard output?
You need to set the Encoding property of the WebClient:
$wc = New-Object System.Net.WebClient
$wc.Encoding = [System.Text.Encoding]::UTF8
([xml]$wc.DownloadString( "http://blogs.msdn.com/powershell/rss.aspx" )).rss.channel.item | format-table title,link
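For reference, this corruption is the classic UTF-8-read-as-Windows-1252 mojibake; a short Python sketch (illustration only) reproduces it:

```python
# U+2019 (right single quotation mark) is encoded in UTF-8 as E2 80 99;
# decoding those three bytes as Windows-1252 yields "â€™".
s = "Don’t"
mis = s.encode("utf-8").decode("cp1252")
print(mis)  # Donâ€™t
```

Setting $wc.Encoding tells WebClient to decode the response bytes as UTF-8 instead of the default, which avoids exactly this substitution.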

How do I add a multiline REG_SZ string to the registry from the command line?

As part of a build setup on a windows machine I need to add a registry entry and I'd like to do it from a simple batch file.
The entry is for a third party app so the format is fixed.
The entry takes the form of a REG_SZ string but needs to contain newlines, i.e. 0x0A characters, as separators.
I've hit a few problems.
My first attempt used regedit to load a generated .reg file. This failed: it did not seem to like either long strings or strings with newlines. I discovered that export works fine but import fails. I was able to test export because the third-party app adds similar entries directly through the Win32 API.
My second attempt used the command REG ADD, but I can't find any way to add the newline characters; everything I try just ends up with a literal string being added.
You can import multiline REG_SZ strings containing carriage return (CR) and linefeed (LF) end-of-line (EOL) breaks into the registry using .reg files as long as you do not mind translating the text as UTF-16LE hexadecimal encoded data. To import a REG_SZ with this text:
1st Line
2nd Line
You might create a file called MULTILINETEXT.REG that contains this:
Windows Registry Editor Version 5.00
[HKEY_CURRENT_USER\Environment]
"MULTILINETEXT"=hex(1):31,00,73,00,74,00,20,00,4c,00,69,00,6e,00,65,00,0d,00,0a,00,\
32,00,6e,00,64,00,20,00,4c,00,69,00,6e,00,65,00,0d,00,0a,00,\
00,00
To encode ASCII into UTF-16LE, simply add a null byte after each ASCII code value. REG_SZ values must terminate with a null character (,00,00 in UTF-16LE notation).
Import the registry change in the batch file with REG.EXE IMPORT MULTILINETEXT.REG.
The example uses the Environment key because it is convenient, not because it is particularly useful to add such data to environment variables. One may use RegEdit to verify that the imported REG_SZ data contains the CRLF characters.
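If you need to produce such hex(1) lines for arbitrary text, the byte layout is easy to generate programmatically. A Python sketch (the helper name reg_sz_hex is made up for illustration):

```python
def reg_sz_hex(text: str) -> str:
    """Encode text as a .reg hex(1) payload: UTF-16LE bytes, NUL-terminated,
    rendered as comma-separated lowercase hex pairs."""
    payload = (text + "\0").encode("utf-16-le")
    return ",".join(f"{b:02x}" for b in payload)

hex_line = reg_sz_hex("1st Line\r\n2nd Line\r\n")
print(hex_line[:23])  # 31,00,73,00,74,00,20,00 -- matches the example above
```

Wrapping the output with backslash line continuations, as in the .reg example, is purely cosmetic; regedit accepts the value on one line.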
If you're not constrained to a scripting language, you can do it in C# with:
Registry.CurrentUser.OpenSubKey(@"software\classes\something", true).SetValue("some key", "sometext\nothertext", RegistryValueKind.String);
You could create a VBScript (.vbs) file and call it from a batch file, assuming you're doing other things in the batch besides this registry change. In VBScript it would look something like:
set WSHShell = CreateObject("WScript.Shell")
WSHShell.RegWrite "HKEY_LOCAL_MACHINE\SOMEKEY", "value", "type"
You should be able to find the possible type values using Google.
Another approach -- that is much easier to read and maintain -- is to use a PowerShell script. Run PowerShell as Admin.
# SetLegalNotice_AsAdmin.ps1
# Define multi-line legal notice registry entry
Push-Location
Set-Location -Path Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\
$contentCaption="Legal Notice"
$contentNotice = @"
This is a very long string that runs to many lines.
You are accessing a U.S. Government (USG) Information System (IS) that is provided for USG-authorized use only.
By using this IS (which includes any device attached to this IS), you consent to the following conditions:
-The USG routinely intercepts and monitors communications on this IS for purposes including, but not limited to, penetration testing, COMSEC monitoring, network operations and defense, personnel misconduct (PM), law enforcement (LE), and counterintelligence (CI) investigations.
etc...
"@
# Caption (note: the caption belongs in legalnoticecaption, not legalnoticetext)
New-ItemProperty -Path . -Name legalnoticecaption -PropertyType MultiString -Value $contentCaption -Force
# Notice
New-ItemProperty -Path . -Name legalnoticetext -PropertyType MultiString -Value $contentNotice -Force
Pop-Location
