Unknown Character

I'm facing an issue with an unknown character.
I was trying to compile some packages in the database through a script and got the error below:
SP2-0734: unknown command beginning "?SET DEF..." - rest of line ignored.
When I open the log file in Notepad++, it shows the line as above.
If I open the same log file in the SciTE editor, it shows the line as:
SP2-0734: unknown command beginning "ï»¿SET DEF..." - rest of line ignored.
I don't understand what the issue could be.
Any help would be welcome.

Your script has an unprintable character at the start (as you discovered from the comments), which some editors don't display at all and others display as an unknown character. "ï»¿" is the byte order mark (BOM):
The UTF-8 representation of the BOM is the byte sequence
0xEF,0xBB,0xBF. A text editor or web browser interpreting the text as
ISO-8859-1 or CP1252 will display the characters ï»¿ for this.
According to that article, some editors (notably Notepad) add it automatically. It should be safe to open the file with a hex editor and remove the three extra bytes, and you'll then be able to run the script normally.
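To check the quoted byte sequence yourself, here is a minimal Python sketch. The script contents are hypothetical (only the leading BOM matters), and the cp437 line is an extra illustration of how a console using that OEM code page would show the same three bytes.

```python
import codecs

# Hypothetical script contents; only the leading BOM matters here.
script_bytes = codecs.BOM_UTF8 + b"SELECT 1 FROM dual;\n"


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


bom_found = has_utf8_bom(script_bytes)

# The same three bytes, misread as single-byte code pages:
shown_as_cp1252 = codecs.BOM_UTF8.decode("cp1252")  # "ï»¿"
shown_as_cp437 = codecs.BOM_UTF8.decode("cp437")    # "∩╗┐"
```

If `bom_found` is true for your script, the first three bytes are the ones to remove.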

Related

Why would an auto conversion of LF to CRLF by Xerces result in CRCRLF?

From the Xerces documentation on setNewLine, “However, Xerces-C++ always uses LF when this property is set to null since otherwise automatic translation of LF to CR-LF on Windows for text files would result in such files containing CR-CR-LF. If you need Windows-style end of line sequences in your output, consider writing to a file opened in text mode or explicitly set this property to CR-LF.” That statement makes no sense to me.
https://xerces.apache.org/xerces-c/apiDocs-3/classDOMLSSerializer.html#a56882d2fe0b4a0ecb1b3968febbcf4a3
Why an automatic conversion of line endings would result in a duplicate CR is beyond me. I do not understand why that would ever be reasonable. I have tried changing the code to explicitly set the line ending to CR-LF as described in the documentation, and that does not work. I still end up with XML files that have CRCRLF as the line ending, and then I have to manually remove the duplicate CR with a text editor such as Notepad++.
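One plausible reading of the quoted documentation, sketched in Python rather than Xerces-C++: a text-mode stream on Windows rewrites every LF as CR-LF, so output that already contains CR-LF gains a second CR. Here `newline="\r\n"` on a `TextIOWrapper` stands in for Windows text-mode translation, and the function name is made up for this illustration. Whether this is what happens in the code above is an assumption.

```python
import io


def write_through_text_mode(text: str) -> bytes:
    """Write text through a stream that translates LF to CR-LF, return the raw bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\r\n")
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # leave the BytesIO open; we only wanted its contents
    return data


lf_only = write_through_text_mode("<root/>\n")          # b"<root/>\r\n"
already_crlf = write_through_text_mode("<root/>\r\n")   # b"<root/>\r\r\n"
```

Under that assumption, getting CRCRLF after setting the property to CR-LF would suggest the output is also passing through a text-mode stream, in which case only one of the two should add the CR.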

Ruby invalid multibyte char error (Sep 2019)

My script fails on badly encoded input. I converted all files to UTF-8, but some still won't convert or contain wrong characters.
It actually fails at the variable assignment step.
Can I set up some kind of error handling for this case, like below, so my loop will continue? That ¿ causes all the problems.
I need to run this script all the way through without errors. I have already tried encoding and force_encoding and a shebang line. Does Ruby have any kind of error-handling routine so I can handle the bad case and continue with the rest of the script? How do I get rid of the error invalid multibyte char (UTF-8)?
line = '¿USE [Alpha]'
lineOK = ' USE [Alpha] OK line'
>ruby ReadFile_Test.rb
ReadFile_Test.rb:15: invalid multibyte char (UTF-8)

I could reproduce your issue by saving the file with ISO-8859-1 encoding.
Running your code with the file in this non-UTF-8 encoding produced the error. My solution was to save the file as UTF-8.
I am using Sublime Text as my editor, which has the option 'File > Save with Encoding'. I chose 'UTF-8' and was able to run the script.
After that, puts line.encoding printed UTF-8 and the error was gone.
I suggest re-checking the encoding of your saved script file.
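To find which lines of a saved file are not valid UTF-8 before running it, a small check like the following may help. It is a Python sketch rather than Ruby, the function name is made up, and the sample bytes assume the first line was saved as ISO-8859-1, as in the reproduction above.

```python
def find_invalid_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that do not decode as UTF-8."""
    bad_lines = []
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines


# Assumed sample: line 1 saved as ISO-8859-1, where "¿" is the single byte 0xBF.
sample = (
    "line = '¿USE [Alpha]'\n".encode("iso-8859-1")
    + "lineOK = ' USE [Alpha] OK line'\n".encode("utf-8")
)

print(find_invalid_utf8_lines(sample))  # [1]
```

A lone 0xBF byte is a UTF-8 continuation byte with nothing before it, which is why the first line is rejected.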

Corruption when using certain batch variable names in custom build command

I have a VS2013 project with a custom build command. In the command script I set an environment variable, and read it out again in the same script. I can confirm by calling set that setting the variable works. However, depending on the variable name, I can't read it out again.
The following works as expected when run as a batch script:
set AVAR=xxx
set ABLAH=xxx
set BBLAH=xxx
set DEV=xxx
set #ABLAH=xxx
echo %AVAR%
echo %ABLAH%
echo %BBLAH%
echo %DEV%
echo %#ABLAH%
But produces the following output in the project:
1> xxx
1> «LAH
1> »LAH
1> ÞV
1> xxx
In this case, the name AVAR works, but many others don't. Also, variables starting with # seem to work. Any idea what is going on?

I've found the solution. Visual Studio (msbuild) converts %XX escape sequences, as in URLs. I only expected it to do so in URLs, like browsers do. However, it seems to replace them everywhere.
So when it encounters %ABCDE%, it recognizes %AB and inserts the character « = 0xAB, giving «CDE% to the batch interpreter. But if the code is not a valid hexadecimal number, it silently ignores it, and the interpreter sees the right characters. That's why variable names with # at the beginning always worked.
So the solution is to escape at least every % in front of a valid hex code 00-FF, or better all of them, with %25.
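The behaviour described above can be imitated in a few lines of Python. This is only a simulation of the replacement rule as I understand it, not msbuild's actual implementation, and both function names are made up.

```python
import re


def simulate_msbuild_unescape(command: str) -> str:
    """Replace every %XX with a valid two-digit hex code by that character."""
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        command,
    )


def escape_percent_signs(command: str) -> str:
    """Escape every percent sign as %25 so it survives the unescaping step."""
    return command.replace("%", "%25")


broken = simulate_msbuild_unescape("echo %ABLAH%")    # "echo «LAH%"
untouched = simulate_msbuild_unescape("echo %AVAR%")  # "echo %AVAR%", since "AV" is not hex
fixed = simulate_msbuild_unescape(escape_percent_signs("echo %ABLAH%"))  # "echo %ABLAH%"
```

The leftover single % in the broken case is then dropped by the batch interpreter, which matches the «LAH output in the question.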
An easy solution would be to just edit the corresponding commands in the GUI (via project properties), and not directly in the .vcxproj or .props file. This way, VS inserts the correct escape codes. In my case this was not possible since the commands were defined as user macros (Property Pages: Common Properties/User Macros). My commands span multiple lines, but the user macro editor only supports single lines.
Another thing to watch out for is that it doesn't only replace percent signs. Other symbols have special meaning and have to be escaped, too. (This goes beyond XML entities, like & -> &amp;.) MSDN has a list of special characters. The characters are: % $ @ ' ; ? *. It doesn't seem to be necessary to replace all of them all the time, but if you notice funky behavior then this is a thing to look at. You can try to enter these characters through the GUI and see whether and how VS escapes them in the project file.
One other character to note especially is the semicolon. If you define a property with unescaped semicolons, like <MyPaths>DirA;DirB</MyPaths>, msbuild/VS will internally convert them to newlines (well, or it splits the property into a list or something). But it will still show the paths as separated with semicolons in the property pages! Except when you click the dropdown button next to a property and select <Edit...>, then it will show the paths as a list or separated by newlines! This is completely invisible most of the time, except when you set a property not in XML or the GUI but by reading the output of a command into it. In this case the command must output newlines if you want the effect of a semicolon. Otherwise you don't get multiple paths, but one long path with semicolons in it.

In North American and Western European countries, batch files are usually "ASCII" files that use an OEM code page such as code page 850 (OEM Multilingual Latin I) or code page 437 (OEM US), not Windows-1252, which is normally used for single-byte encoded text files. The code page to use for a batch file depends on the local settings for non-Unicode files in the console. The code page does not matter if the batch file only contains characters with a code value below 128, i.e. if it is a true ASCII file.
Therefore, make sure that you edit and save the batch file as an ASCII file using the right code page, not as a Unicode file using UTF-8, UTF-16 Little Endian or UTF-16 Big Endian. The Visual Studio editor uses UTF-8 encoding for files by default. This is the wrong encoding for batch files.
In code page 850, character « has the code value 174 decimal (0xAE). In code page 1252, code value 174 is the character ®, which indicates that you want the batch file to output characters encoded in UTF-8 (where ® is also code point 174) or Windows-1252.
A simple batch file for demonstration, stored as an ANSI file with code page Windows-1252:
@echo off
cls
echo This batch file was saved as ANSI file using code page Windows-1252.
echo.
echo Registered trademark symbol ® has code value 174 in Windows-1252.
echo.
echo But active code page is not Windows 1252 in console window.
echo.
chcp
echo.
echo Therefore the left guillemet character is output instead of registered
echo trademark symbol as this character has in code page 850 code value 174.
echo.
echo Press any key to continue ...
pause>nul
Batch files are for DOS/Windows and should therefore use carriage return + line feed as the line terminator, not just line feed (UNIX) or just carriage return (classic Mac OS).
Some text editors display the line terminator type and the encoding or code page of the active file in the status bar at the bottom of the main window.
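The code page mismatch can be checked without a console by decoding the same byte under each code page. A minimal Python sketch (the function name is made up):

```python
def decode_in_code_pages(byte_value: int) -> dict:
    """Decode one byte under the code pages discussed above."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in ("cp850", "cp437", "cp1252")}


code_174 = decode_in_code_pages(0xAE)
# {'cp850': '«', 'cp437': '«', 'cp1252': '®'}

# Bytes below 128 are plain ASCII and decode identically in all three.
code_65 = decode_in_code_pages(0x41)
# {'cp850': 'A', 'cp437': 'A', 'cp1252': 'A'}
```

So a file saved as Windows-1252 with ® in it shows « in a console running code page 850 or 437, which is what the demonstration batch file does.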

How to get a utf code from symbol in linux

I'm struggling with a special symbol in a text file on linux. I actually successfully pasted it between the following letters "a‏a" (my cursor in Geany stops but no character is displayed).
I'd like to know the easiest way to get its Unicode code point (in the form U+0000). I'm using Ubuntu and Geany, and I tried hexdump on a file containing it, but I'm obviously missing something.

You could open the file with vim, put the text cursor over the character, then type 'ga' (without quotes) and it will display the character code in decimal, hex and octal in the status line.
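If you would rather not use vim, a short Python sketch can list every code point in a string. The function name is made up, and the right-to-left mark in the sample is only an assumed stand-in for the invisible character, not a claim about what your file contains.

```python
import unicodedata


def describe_characters(text: str) -> list:
    """Return (code point, Unicode name) pairs for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Assumed sample: an invisible right-to-left mark between two letters.
for code_point, name in describe_characters("a\u200fa"):
    print(code_point, name)
# U+0061 LATIN SMALL LETTER A
# U+200F RIGHT-TO-LEFT MARK
# U+0061 LATIN SMALL LETTER A
```

For a file, read it with open(path, encoding="utf-8").read() and pass the result to the same function.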

Is it possible to run a SQLPLUS script on a file encoded as UTF-8 with BOM

I'm trying to run a collection of scripts which have been auto-generated from a large number of sources. Unfortunately some of these have been generated as UTF-8 with BOM. I have in place a system for automatically removing the BOM, but it's a bit of a messy process.
Failing to remove the BOM generates the error:
SP2-0042: unknown command "ï»¿" - rest of line ignored.
Is it possible to run SQLPLUS on a script file which has a BOM?

It is possible to run SQLPLUS with such a script, but SQLPLUS will report an error on the first line because of the BOM.
You probably wanted to ask whether you can avoid this error. That is not possible, AFAIK. Erwin thinks so too.
You can avoid losing any information by generating those files with an empty first line. Then you can just ignore the error.

This has been a bug open with Oracle for over 6 years now, but it doesn't look like they are interested in fixing it.
Their 'recommended workaround' (Doc ID 788156.1 Section C.6) is to strip the BOM or make your first script line a comment, and then ignore this error.
SP2-0042: unknown command "∩╗┐" - rest of line ignored.
Or
SP2-0734: unknown command beginning "-- Commen..." - rest of line ignored.
Bug 13515585 Details (requires OTN login):
Bug 13515585: ADD SUPPORT FOR THE UTF-8 BOM IN SQLPLUS
Bug Status: Internal (Oracle) Review
Created: 19-Dec-2011
Updated: 29-Sep-2015
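For the "strip the BOM" option, here is a minimal sketch in Python, assuming the scripts are ordinary files you are allowed to rewrite in place. The function names are made up and nothing here is SQLPLUS-specific.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(path: str) -> bool:
    """Rewrite the file without its BOM. Return True if the file was changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_utf8_bom(original)
    if cleaned == original:
        return False
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```

Working on bytes rather than decoded text leaves everything after the BOM untouched, including line endings.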
