Problem when COPY INTO is performed on a file containing special characters (UTF-8)

I have a file whose column data looks something like the line below. As you can see, there is a special character in the middle.
The original datatype was varchar(60). When COPY INTO is performed, it throws an error. I changed the collation to UTF-8, but it still fails. Is there a way to solve this problem?
ABC COMPANY ▒ Sample Data
Thanks!
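The ▒ symptom usually means the file is not actually UTF-8: a byte from a single-byte code page (Windows-1252, say) is not a valid UTF-8 sequence, so the loader either errors out or substitutes a replacement glyph. A minimal Python sketch of the failure mode, assuming the file was really written in Windows-1252 (the actual source encoding of the file is a guess here):

```python
# A line written in Windows-1252: byte 0x96 is an en dash in that code page
raw = b"ABC COMPANY \x96 Sample Data"

# Decoding it as UTF-8 fails, because 0x96 alone is not valid UTF-8
try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    print("not valid UTF-8:", e.reason)

# Decoding with the file's real encoding succeeds...
text = raw.decode("windows-1252")
print(text)

# ...and re-encoding as UTF-8 produces a file COPY INTO can load
fixed = text.encode("utf-8")
```

If this matches your situation, the fix is to re-encode the file to UTF-8 before loading (or to tell the loader the file's real code page), rather than changing the table collation.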

Related

data factory special character in column headers

I have a file I am reading into a blob via Data Factory.
It's formatted in Excel. Some of the column headers have special characters and spaces, which isn't good if you want to take it to CSV or Parquet and then SQL.
Is there a way to correct this in the pipeline?
Example
"Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
Uncheck "Auto mapping".
Click "+ Add mapping"
For the column name, enter "true()" to process all columns.
Enter an appropriate expression to rename the columns. This example uses regular expressions to remove any character that is not a letter.
SPECIAL CASE
There may be an issue with this if the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this is the case, as Data Flow is clearly aware of the column names. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on just the first row, you could remove the special characters.
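The renaming rule above can be sketched outside Data Flow as well. A small Python illustration of the same idea, using the two example headers from the question (the exact replacement policy, non-alphanumeric runs collapsed to a single underscore, is one reasonable choice, not the only one):

```python
import re

headers = [
    "Activations in last 15 seconds high+Low",
    "first entry speed (serial T/a)",
]

def clean(name: str) -> str:
    # Replace every run of characters that are not letters or digits
    # with a single underscore, then trim leading/trailing underscores.
    return re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_")

for h in headers:
    print(clean(h))
```

This yields SQL- and Parquet-safe names like `Activations_in_last_15_seconds_high_Low`, which is the same effect the rule-based Select mapping produces inside the pipeline.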

Character encoding issues - Oracle SQL

I made a simple query against a remote database. The table I query has all fields defined as VARCHAR2. However, some fields return "?" for characters like º and £. I checked the encoding and got:
NLS_CHARACTERSET: AL32UTF8
NLS_NCHAR_CHARACTERSET:AL16UTF16
Checking my /etc/default/locale file, these are the results:
LANG="en_US.UTF-8"
LC_NUMERIC="pt_BR.UTF-8"
LC_TIME="pt_BR.UTF-8"
LC_MONETARY="pt_BR.UTF-8"
LC_PAPER="pt_BR.UTF-8"
LC_NAME="pt_BR.UTF-8"
LC_ADDRESS="pt_BR.UTF-8"
LC_TELEPHONE="pt_BR.UTF-8"
LC_MEASUREMENT="pt_BR.UTF-8"
LC_IDENTIFICATION="pt_BR.UTF-8"
The encoding on both ends is UTF-8. Is there another configuration that I'm missing?
When you say 'some fields return "?" in characters like ...' I assume that is shown on your screen, right? You don't know what is REALLY returned, you only know what's on your screen.
To see what is REALLY returned, you could do something like
select dump('£500') from dual;
DUMP('£500')
--------------------------
Typ=96 Len=4: 163,53,48,48
EDIT: As discussed in the comments below, if you type EXACTLY that command at your terminal and you do, in fact, have a display problem, you will see garbage on the way in. Rather, to see what is stored in the database, you must refer to an actual table, and a column that has those string values in it. For example if the column name is COL1 in the table TBL, and there is also an ID column and for ID = 1000 you have a COL1 value with the pound sign in it, run
select dump(COL1) from TBL where ID = 1000;
Obviously, there are no issues with the INPUT since the input no longer has a pound sign in it (like my first example did). But on the way out, the DUMP may show the proper character is there - however your display is not able to show it correctly.
END EDIT
If you see the code 163 in the DUMP, that means the pound sign is stored correctly in the database, and the issue is just how it is displayed on your screen. In that case, you may have an issue with your NLS_LANG setting. There is excellent information here:
http://www.oracle.com/technetwork/products/globalization/nls-lang-099431.html
If you find that you have to work with different character sets often, you may benefit from reading this article carefully. It will show you how to find out what your current character set is, how to change it, and why the "obvious" things one would look at are in fact not very helpful. The issue is not too complicated, but not trivial either.
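What DUMP exposes is simply the raw bytes, and the same character maps to different byte values in different character sets. The single byte 163 in the example corresponds to £ in ISO-8859-1/Windows-1252; in UTF-8 (AL32UTF8) the same character occupies two bytes. A small Python sketch of the byte sequences DUMP would list for '£500' under a few encodings (offered for illustration, not as Oracle behaviour):

```python
text = "£500"

# Byte values for the same string under several encodings,
# printed in the style of Oracle's DUMP output
for enc in ("iso-8859-1", "utf-8", "utf-16-be"):
    data = text.encode(enc)
    print(f"{enc}: Len={len(data)}: {','.join(str(b) for b in data)}")
```

The ISO-8859-1 line reproduces the `Len=4: 163,53,48,48` dump shown above, which is why comparing DUMP output against the database character set tells you whether the stored bytes, or only the display path, are at fault.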

How do I convert these characters to their readable form

I have some columns in my Oracle database table which have some � characters in them.
How do I decode them to their original readable form?
I know it is related to encoding, but I need to find a solution for this.
In my PHP application I get those characters as plain '??'.
I am using SQL Developer to view the records.
You have to convert them from UTF-8 to your current encoding, or vice versa.
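One common version of this problem is mojibake: UTF-8 bytes that were decoded with the wrong single-byte encoding. As long as no byte was actually replaced, the damage is reversible by undoing the wrong decode, as this Python sketch shows (assuming Latin-1 was the wrong encoding; if the stored data already contains the literal U+FFFD � character, the original bytes are gone and cannot be recovered this way):

```python
original = "café"
utf8_bytes = original.encode("utf-8")      # b'caf\xc3\xa9'

# Misreading the UTF-8 bytes as Latin-1 produces mojibake
mojibake = utf8_bytes.decode("latin-1")    # 'caf' + two garbage characters

# Reverse the wrong decode, then decode with the right encoding
repaired = mojibake.encode("latin-1").decode("utf-8")
print(repaired)
```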

Convert ascii files into normal human-readable file

I have got ASCII files and want to convert them into Excel or a tab/CSV-delimited text file. Each file is a table with field names and field attributes. It also includes the index name, table name, and the field(s) to index, if required, depending on the software, but I don't think that part is necessary to deal with. Field names and field attributes are enough, I hope. I just want the information hidden inside. Can any of you experts help me get this done?
The lines are something like this:
10000001$"WORD" WORD$10001890$$$$495.7$$$N$$
10000002$11-word-word word$10000002$$$$$$$Y$$
10000003$11-word word word$10033315$0413004$$$$$$N$$
10000004$11-word word word$10033315$$$$$$$Y$017701$
The general answer, before knowing your ASCII file in detail, operating system, and so on, would be:
1 - cut the top n lines that contain the information you don't want. Leave the field names, if you want to.
2 - check whether the fields are separated by a common character, for example a comma (in your sample it appears to be "$")
3 - import the file into a spreadsheet program, like Excel or OpenOffice Calc. In OOCalc, choose to import the file, then select the correct separator character
that's all.
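The steps above can also be scripted. A minimal Python sketch that splits the sample lines on "$" and rewrites them as a standard CSV a spreadsheet can open directly (quoting is disabled on read so the embedded quotes in the first line pass through unchanged):

```python
import csv
import io

raw = '''10000001$"WORD" WORD$10001890$$$$495.7$$$N$$
10000002$11-word-word word$10000002$$$$$$$Y$$
'''

# Read $-delimited rows; QUOTE_NONE keeps embedded " characters as data
rows = list(csv.reader(io.StringIO(raw), delimiter="$", quoting=csv.QUOTE_NONE))

# Write standard comma-separated output
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

To produce a real file, replace the `StringIO` objects with `open(...)` calls; the resulting CSV imports into Excel or OOCalc without any separator dialog.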

Strings are appended with special characters when returned from the database

I am using NHibernate to query an Oracle 8i database. The problem is that all the strings in the returned objects are postfixed with special characters. For example:
CUSTOMER,ONE�������
The NHibernate field type is AnsiString, the Oracle datatype is CHAR(20), and the character set is CHAR_CS. I am totally new to Oracle, so I don't have a clue what's going on :(
CHAR(20) means the field is padded as necessary to be exactly 20 characters long. The padding character is a blank.
There must be a problem somewhere in your character set settings if padding characters appear as question marks. You may find more insight on your problem here.
What you need here is to trim the returned strings, or better yet move to VARCHAR2(20).
I couldn't find a proper solution for this issue, but changing the NHibernate driver from 'OracleClientDriver' to 'OleDbDriver' solved it. Still, if anyone knows how to tackle this issue properly, please let me know, as I don't like using the OleDbDriver for accessing Oracle because of possible compatibility issues.
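The padding behaviour of CHAR(20) is easy to reproduce: the value always comes back at the full declared width, and a trailing trim restores the logical string. A Python sketch of the symptom and the fix (the � characters in the question are that same padding rendered through a broken charset path; once the encoding is right, the padding is plain spaces):

```python
# A CHAR(20) column returns its value space-padded to the declared width
raw = "CUSTOMER,ONE".ljust(20)   # what the driver hands back

print(repr(raw))                 # 'CUSTOMER,ONE        '
trimmed = raw.rstrip()           # trim the blank padding on the way out
print(repr(trimmed))             # 'CUSTOMER,ONE'
```

Trimming in the mapping layer works, but as the answer above notes, moving the column to VARCHAR2(20) removes the padding at the source and is the cleaner fix.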
