Why does the database size decrease in PostgreSQL after migrating from an Oracle schema with LOB, CLOB and BLOB datatypes?
The main reason is that Postgres by default compresses values of variable-length data types that are bigger than (approximately) 2000 bytes - mainly the text, varchar and bytea types.
Oracle will only compress the content of LOB columns if you are using the Enterprise Edition and enable the compression when defining the LOB column (the most important part is to use SecureFile instead of BasicFile).
Most probably your LOB columns were defined without compression in Oracle and contain many values bigger than 2000 bytes; that is why you see a reduction in size due to Postgres' automatic compression.
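To make that concrete, this is roughly what enabling LOB compression looks like on the Oracle side (a hypothetical table; it needs Enterprise Edition with the Advanced Compression option), together with a quick way to check the resulting size on the Postgres side after migration:

CREATE TABLE docs (
    id   NUMBER PRIMARY KEY,
    body CLOB
)
LOB (body) STORE AS SECUREFILE (COMPRESS MEDIUM);

-- PostgreSQL: total on-disk size of the migrated table, including TOAST
SELECT pg_size_pretty(pg_total_relation_size('docs'));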
Related
I am using Oracle SQL Developer 18.3, but when I want to edit (or insert) a column with the RAW datatype, it shows the field as read-only and does not allow editing.
As you may know, Oracle SQL Developer shows RAW data as a hex string, unlike BLOB data, where it does not show the value but lets you download and upload the contents.
I know that I can update (or insert) the RAW data as a hex string like this:
CREATE TABLE t1(the_id NUMBER PRIMARY KEY, raw_col RAW(2000));
INSERT INTO t1(the_id, raw_col) VALUES(1, '1a234c');
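Updates work the same way; wrapping the literal in HEXTORAW just makes the hex-to-RAW conversion explicit (this statement is my own illustration, not from the question):

UPDATE t1 SET raw_col = HEXTORAW('1a234c') WHERE the_id = 1;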
But I want to do it through the Oracle SQL Developer GUI.
Sorry, we do not have a 'raw' editor like we have for BLOBs, so you will have to use SQL.
If you want a reason for that omission, it's partly due to the fact that RAW is not a commonly used data type in Oracle Database.
Related: if you're talking about LONG RAW
We (Oracle) recommend you stop using it, and instead convert them to BLOBs.
The LONG RAW datatype is provided for backward compatibility with
existing applications. For new applications, use the BLOB and BFILE
datatypes for large amounts of binary data. Oracle also recommends
that you convert existing LONG RAW columns to LOB columns. LOB columns
are subject to far fewer restrictions than LONG columns. Further, LOB
functionality is enhanced in every release, whereas LONG RAW
functionality has been static for several releases.
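The conversion the documentation recommends can be done in place with a single ALTER TABLE; the table and column names here are hypothetical:

ALTER TABLE legacy_docs MODIFY (payload BLOB);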
Quote from the documentation:
The LOB datatypes for character data are CLOB and NCLOB. They can store up
to 8 terabytes of character data (CLOB) or national character set data (NCLOB).
and this is another quote from the same page:
The CLOB and NCLOB datatypes store up to 128 terabytes of character data in the database. CLOBs store database character set data, and NCLOBs store Unicode national character set data.
I am a little confused - is there an inconsistency in the documentation, or am I missing something?
The difference stems from the fact that you can define LOBs with different "chunk" sizes, and their maximum size is limited by the number of database blocks used for them. If you create a database (or tablespace) with a larger block size, a LOB can contain more data.
From the manual:
CLOB objects can store up to (4 gigabytes -1) * (the value of the CHUNK parameter of LOB storage) of character data
And the next sentence describes the relation to the blocksize:
If the tablespaces in your database are of standard block size, and if you have used the default value of the CHUNK parameter of LOB storage when creating a LOB column, then this is equivalent to (4 gigabytes - 1) * (database block size).
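Plugging in numbers shows where the 8 terabyte and 128 terabyte figures come from; the block size itself can be checked with a simple query (the arithmetic below is illustrative):

SELECT value AS db_block_size FROM v$parameter WHERE name = 'db_block_size';
--  2 KB blocks: (4 GB - 1) *  2048  ~   8 TB
--  8 KB blocks: (4 GB - 1) *  8192  ~  32 TB
-- 32 KB blocks: (4 GB - 1) * 32768  ~ 128 TB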
Setup
I have an Oracle table with a couple of attributes and a CLOB column. I have created the table below in two ways, each of which should give the same behavior.
CREATE TABLE DEMO(
a number (10, 2),
data CLOB
)
CREATE TABLE DEMO(
a number (10, 2),
data CLOB
) LOB (data) STORE AS (ENABLE STORAGE IN ROW)
Scenario
As per the Oracle documentation, when the CLOB data is larger than about 4000 bytes it is stored out of line, otherwise inline.
When I store a small CLOB value, say "Hello", in this table and then look at the segment information for the DEMO table and its LOB segment, it shows that all the data goes into the table and no new blocks are consumed in the LOB segment.
When I store larger values totalling fewer than 1500 characters, I get the same behavior as above.
But when I store values of between 2000 and 3000 characters, the LOB data goes to the LOB segment even though the total is less than 3000 characters.
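(For reference, the segment comparison described above can be done with a query along these lines; the DEMO table name is from the setup above, the views are the standard data dictionary.)

SELECT s.segment_name, s.segment_type, s.bytes / 1024 AS kb
FROM   user_segments s
WHERE  s.segment_name = 'DEMO'
   OR  s.segment_name IN (SELECT l.segment_name FROM user_lobs l WHERE l.table_name = 'DEMO');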
Question
Why is data smaller than 3000 characters going to the LOB segment? Is it that each character takes 2 bytes, which would explain why data up to about 1500 characters stays in the table instead of the LOB segment?
Problem
Lots of disk space is being wasted because of the LOB segment: the CHUNK size is 8 KB, and the data per row is usually around 3-4K characters, in some cases more. So essentially about 4 KB is wasted for each row, and in our case of 20 million rows that runs into tens of GBs.
This may explain the above behaviour:
"The CLOB and NCLOB datatypes store up to 4 gigabytes of character data in the database. CLOBs store database character set data and NCLOBs store Unicode national character set data. For varying-width database character sets, the CLOB value is stored in the database using the two-byte Unicode character set, which has a fixed width. Oracle translates the stored Unicode value to the character set requested on the client or on the server, which can be fixed-width or varying width. When you insert data into a CLOB column using a varying-width character set, Oracle converts the data into Unicode before storing it in the database."
http://docs.oracle.com/cd/B10500_01/server.920/a96524/c13datyp.htm#3234
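Whether this applies can be confirmed by checking that the database character set is a varying-width one (e.g. AL32UTF8), in which case CLOB data is stored internally in a two-byte encoding:

SELECT value FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET';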
We have a report (simple text) to store in Oracle; the average case will be less than 4K, but some cases exceed that. So an option is to use a CLOB.
It is for logging purposes only, not used in queries or updates - inserted once and retrieved a few times.
Space and overall schema (other tables) performance are the main concerns.
I have read about the CLOB storage allocation format.
We are contemplating using two columns, msgV VARCHAR2(4000) and msgC CLOB. When the text exceeds 4K we store it in the CLOB; otherwise it goes into the usual VARCHAR2 and the CLOB remains NULL.
So my question is,
Is this scheme better w.r.t to above performance consideration or simply use CLOB ?
(apart from it entailing more coding work maintaining this condition everywhere)
And what is the space consumed by NULL and empty CLOBs (or any datatype)?
Use a CLOB. If the data in the CLOB is under roughly 4000 bytes, it will actually be stored inline. See the section in the link below comparing LOBs to LONG.
Oracle Lob
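A minimal sketch of that single-CLOB approach (table and column names are assumptions, not from the question); with inline storage enabled, values under roughly 4000 bytes stay in the row and only the larger ones move to the LOB segment:

CREATE TABLE report_log (
    id  NUMBER PRIMARY KEY,
    msg CLOB
)
LOB (msg) STORE AS (ENABLE STORAGE IN ROW);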
I created a program to insert large files into the database (around 10 MB each). I chose the BLOB type for the objects column in my table.
Now I have read that BLOB only supports binary objects with a maximum length of 4 MB.
Could you advise me what I can do in this case to upload objects larger than 4 MB?
I am using Oracle 9i or 10g.
You read something that appears to be incorrect.
Per the Oracle 10g Release 2 documentation:
The BLOB datatype stores unstructured binary large objects. BLOB objects
can be thought of as bitstreams with no character set semantics. BLOB
objects can store binary data up to (4 gigabytes -1) * (the value of the
CHUNK parameter of LOB storage).
If the tablespaces in your database are of standard block size, and if you
have used the default value of the CHUNK parameter of LOB storage when
creating a LOB column, then this is equivalent to (4 gigabytes - 1) *
(database block size).
The maximum size of a LOB supported by the database is equal to the value of the db_block_size initialization parameter times the value 4294967295. This allows for a maximum LOB size ranging from 8 terabytes to 128 terabytes.
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_lob.htm#i1016062
According to this site,
http://ss64.com/ora/syntax-datatypes.html
BLOB has had a maximum size of 4 GB since Oracle 8, so 10 MB should be no problem.
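One common way to load a file of that size into a BLOB column is via a directory object and DBMS_LOB; the directory, file, table and column names below are assumptions for illustration:

CREATE DIRECTORY file_dir AS '/tmp/files';

DECLARE
  l_blob  BLOB;
  l_bfile BFILE := BFILENAME('FILE_DIR', 'report.bin');
BEGIN
  -- create the row with an empty LOB locator, then fill it from the file
  INSERT INTO files_tbl (id, objects) VALUES (1, EMPTY_BLOB())
  RETURNING objects INTO l_blob;

  DBMS_LOB.OPEN(l_bfile, DBMS_LOB.LOB_READONLY);
  DBMS_LOB.LOADFROMFILE(l_blob, l_bfile, DBMS_LOB.GETLENGTH(l_bfile));
  DBMS_LOB.CLOSE(l_bfile);
  COMMIT;
END;
/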