Migration issue while using DMS: incorrect junk data for empty columns

While migrating from MySQL to Oracle using the AWS DMS service, some large mediumtext columns on the source side (MySQL DB instance) are empty for 75% of the rows in a table, whereas in the target (Oracle) they are populated with some other value (not junk values). To me it looks like the column values are copied incorrectly between rows.
Wherever the source column is empty, some other data is copied. For some of the CLOB columns, around 75% of the rows that are empty on the source side are incorrectly mapped to other data on the Oracle side. We used full LOB mode with a chunk size of 10000 KB.

Some questions or requests -
1. Could you share the table DDL from source and target?
2. Are you sure there is no workload running on the target that could change values in the table outside the DMS process?
3. Full LOB mode migrates LOBs in chunks. Why specify such a high LOB chunk size? Also, don't you know the maximum LOB size, so that you could use limited LOB mode?
4. Could you paste the task ARN here? I work for AWS DMS and can look to see what is going on. Once I find the root cause, I will also make sure I post an analysis here for all Stack Overflow users.
Let me know.
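As a rough illustration of question 3, the two LOB modes map to a handful of fields in the TargetMetadata section of a DMS task-settings document. The sketch below builds both variants in Python. The 10000 KB chunk size is the value from the question; the LobMaxSize of 32768 KB is a made-up placeholder that would have to be replaced by the real maximum LOB size in the source table, and the helper name to_task_settings is hypothetical.

```python
import json

# Full LOB mode as described in the question: LOBs are moved in chunks
# of LobChunkSize kilobytes (10000 KB here).
full_lob_mode = {
    "TargetMetadata": {
        "SupportLobs": True,
        "FullLobMode": True,
        "LobChunkSize": 10000,
        "LimitedSizeLobMode": False,
    }
}

# Limited LOB mode: LOBs larger than LobMaxSize (KB) are truncated,
# so the value must cover the largest LOB in the source.
# ASSUMPTION: 32768 KB is a placeholder, not a value from the question.
limited_lob_mode = {
    "TargetMetadata": {
        "SupportLobs": True,
        "FullLobMode": False,
        "LimitedSizeLobMode": True,
        "LobMaxSize": 32768,
    }
}


def to_task_settings(settings: dict) -> str:
    """Serialize a settings dict to JSON text (hypothetical helper)."""
    return json.dumps(settings, indent=2)


print(to_task_settings(limited_lob_mode))
```

Limited LOB mode only makes sense once the largest LOB in the source is known, which is why the question about the maximum LOB size matters.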

Related

Full table scan behaviour with cache and nocache in oracle 12c

I have the same query running on two different DB servers with almost identical config. The query does a full table scan (FTS) on one table:
SELECT COUNT (1) FROM tax_proposal_dtl WHERE tax_proposal_no = :b1 AND taxid != :b2 AND INSTR(:b3 , ',' || STATUS || ',' ) > 0 ;
On the 1st DB I get the result in less than 3 secs with 0 disk reads, while on the 2nd DB disk reads are high and the elapsed time is approx. 9 secs.
The only difference between the table config on the two DBs is that on the 1st the table has Cache = 'Y' while on the 2nd it has Cache = 'N'. My understanding is that in the case of an FTS the cache won't be used and direct path reads will be used. So why is the performance of the same query affected by cache/nocache? (That is the only difference between the two envs, and even the execution plan is the same.)
As suggested by Jon, and after doing further research on this topic (especially with regard to _SMALL_TABLE_THRESHOLD), I am adding more details.
Current version: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit
Details of 2nd DB:
Total block count of table from DBA_SEGMENTS = 196736
Details of 1st DB:
Total block count of table from DBA_SEGMENTS = 172288
The execution plans on both DBs are the same, but there are two major differences:
a) On the 2nd DB the cache option is false on the table (I tried alter table cache, but it had no impact on performance).
b) On the 2nd DB the _STT parameter is 23920, so as per the 5*_STT rule the table does not qualify as a medium-sized table, while on the 1st DB the _STT parameter is 48496, so as per the 5*_STT rule the table qualifies as a medium-sized table.
Below is a chart, based on my research so far on the _STT and Cache parameters, of how the system behaves for different table sizes.
Please let me know if my understanding is correct: I assume that the Cache option has no impact on medium or large tables but helps retain small tables longer in the LRU. Based on these assumptions and the chart, I conclude that on the 2nd DB the table is classified as large, hence direct path reads (DPR) and a longer elapsed time, while on the 1st DB it is classified as medium, hence cache reads and a shorter elapsed time.
As per this link, I have set the _STT parameter at session level on the 2nd DB:
alter session set "_small_table_threshold"=300000;
Performance has improved considerably and is now almost the same as on the 1st DB, with 0 disk reads, since with this setting the table is considered small.
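The classification can be checked with a few lines of Python. This only encodes the rule of thumb used in this question (small up to _STT blocks, medium up to 5*_STT blocks, large beyond that). The exact boundary handling and the function name classify_table are assumptions, and Oracle's real decision may depend on more than these two numbers.

```python
def classify_table(blocks: int, stt: int) -> str:
    """Classify a table by the question's 5*_STT rule of thumb.

    ASSUMPTION: boundaries are inclusive; this is not documented
    Oracle behavior, only the heuristic described in the question.
    """
    if blocks <= stt:
        return "small"
    if blocks <= 5 * stt:
        return "medium"
    return "large"


# (table blocks from DBA_SEGMENTS, _small_table_threshold)
cases = {
    "1st DB": (172288, 48496),
    "2nd DB": (196736, 23920),
    "2nd DB, session _STT = 300000": (196736, 300000),
}

for name, (blocks, stt) in cases.items():
    print(f"{name}: 5*_STT = {5 * stt}, table is {classify_table(blocks, stt)}")
```

With the figures above, 5*_STT is 242480 on the 1st DB (above the 172288 blocks of the table) and 119600 on the 2nd DB (below its 196736 blocks), which matches the medium versus large classification.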
I have used following articles in my research.
https://jonathanlewis.wordpress.com/2011/03/24/small-tables/
https://hoopercharles.wordpress.com/2010/06/17/_small_table_threshold-parameter-and-buffer-cache-what-is-wrong-with-this-quote/?unapproved=43522&moderation-hash=be8d35c5530411ff0ca96388a6fa8099#comment-43522
https://dioncho.wordpress.com/tag/full-table-scan/
https://mikesmithers.wordpress.com/2016/06/23/oracle-pinning-table-data-in-the-buffer-cache/
http://afatkulin.blogspot.com/2012/07/serial-direct-path-reads-in-11gr2-and.html
http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html

The keywords CACHE and NOCACHE are a bit misleading - they don't simply enable or disable caching, they only make cache reads more or less likely by changing how the data is stored in the cache. Like most memory systems, the Oracle buffer cache is constantly adding new data and aging out old data. The default, NOCACHE, will still add table data from full table scans to the buffer cache, but it will mark it as the first piece of data to age out.
According to the SQL Language Reference:
CACHE
For data that is accessed frequently, this clause indicates
that the blocks retrieved for this table are placed at the most
recently used end of the least recently used (LRU) list in the buffer
cache when a full table scan is performed. This attribute is useful
for small lookup tables.
...
NOCACHE
For data that is not accessed
frequently, this clause indicates that the blocks retrieved for this
table are placed at the least recently used end of the LRU list in the
buffer cache when a full table scan is performed. NOCACHE is the
default for LOB storage.
The real behavior can be much more complicated. The in-memory option, result caching, OS and SAN caching, direct path reads (usually for parallelism), the small table threshold (where Oracle doesn't cache the whole table if it exceeds a threshold), and probably other features I can't think of may affect how data is cached and read.
Edit: I'm not sure if I can add much to your analysis. There's not a lot of official documentation around these thresholds and table scan types. Looks like you know as much about the subject as anyone else.
I would caution that this kind of full table scan optimization should only be needed in rare situations. Why is a query frequently doing a full table scan of a 1GB table? Isn't there an index or a materialized view that could help instead? Or maybe you just need to add more memory if you need the development environment to match production.
Another option, instead of changing the small table threshold, is to change the perceived size of the table. Modify the statistics so that Oracle thinks the table is small. This way no other tables are affected.
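begin
  -- Make the optimizer statistics report the table as only 999 blocks.
  dbms_stats.set_table_stats(ownname => user, tabname => 'TAX_PROPOSAL_DTL', numblks => 999);
  -- Lock the statistics so a later gather does not overwrite them.
  dbms_stats.lock_table_stats(ownname => user, tabname => 'TAX_PROPOSAL_DTL');
end;
/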

Oracle - clean LOB files - recovering disk space

I have a friend who has a website and asked me for help.
I often use MySQL databases but have never used Oracle.
Unfortunately he has an Oracle database, and I can't find a solution.
The available disk space is slowly decreasing. I delete a lot of rows from the table, but that doesn't solve his problem.
The database continues to take up disk space slowly.
I read that LOB files do not return disk space, even if you delete data.
How can I reorganize LOB files easily with a simple query?
(and/or) How can I recover disk space on Oracle?
SELECT DISTINCT VERSION FROM PRODUCT_COMPONENT_VERSION
12.1.0.1.0

The BLOB data stays in the table's blocks even after deletion; the space is only marked as unused. You can use the following command to free up space from the table's BLOB column:
ALTER TABLE <YOUR_TABLE_NAME> MODIFY
LOB (<LOB_COLUMN_NAME>)
( SHRINK SPACE );
The table should now have released some space, which is available for reuse within the tablespace.
Further, you can alter the data file and reduce its size accordingly to free up space on disk. (Note: the space allocated to the data file is not reduced automatically. It must be done manually.)
Cheers!!

SQL Server 2008 R2 Express 10GB Filesize limit

I have reached the file size limit on my SQL Server 2008 R2 Express database, which I believe is 10 GB. I know this because I see Event ID 1101 in the event log:
Could not allocate a new page for database 'ExchangeBackup' because of insufficient disk space in filegroup 'PRIMARY'
I have removed some historic data to work around the problem for now, but it is only a temporary fix. One table (PP4_MailBackup) is much larger than the others, so when I created this database 12 months ago I converted this table to a FILESTREAM table, with the data stored outside the filegroup in the file system. This appeared to be working until I received the error and new data was no longer being added to my database.
When I run a report on table sizes, the Reserved(KB) column adds up to almost 10 GB.
The folder that holds my FILESTREAM data is 176 GB.
The database .mdf file is indeed 10 GB.
Does anyone have any idea why the table PP4_MailBackup is still using nearly 7 GB?
Here is the "Standard Reports -> Disk Usage report" for this database:
Thanks in advance
David
Update
Here is some more info.
There are 868,520 rows in this table.
This command returns 1, so I'm assuming ANSI_PADDING is on. I have never changed this from the default.
SELECT SESSIONPROPERTY('ANSI_PADDING')
The columns are defined like this
Even if every column in every record were filled to its full declared size, by my rough calculation the table would be around 4,125,470,000 bytes. I understand that the nvarchar columns only use the space actually required.
I'm still missing a lot of space.

Not really an answer but more of a conclusion.
I have given up on this problem and resigned myself to removing data to stay under the 10 GB primary file size limit. I figured out that nvarchar columns store 2 bytes per character in order to handle Unicode characters, although they only use the space required and don't pad the column with spaces. This would account for some of the space I can't find.
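To put rough numbers on this, here is a small Python sketch of how the column type changes the space one column needs. It assumes the sizes SQL Server documents for these types: a fixed-width char(n) always takes n bytes, varchar takes one byte per character plus 2 bytes of overhead, and nvarchar takes two bytes per character plus 2 bytes of overhead. Only the row count comes from the question; the 100-character average length and the function name column_bytes are hypothetical.

```python
ROWS = 868_520  # row count reported in the question


def column_bytes(sql_type: str, declared: int, chars: int) -> int:
    """Approximate bytes one value occupies (single-byte collation assumed)."""
    if sql_type == "char":
        return declared        # fixed width, padded with spaces
    if sql_type == "varchar":
        return chars + 2       # actual length + 2-byte overhead
    if sql_type == "nvarchar":
        return 2 * chars + 2   # 2 bytes per character + 2-byte overhead
    raise ValueError(f"unsupported type: {sql_type}")


AVG_CHARS = 100  # ASSUMPTION: hypothetical average value length

for sql_type in ("char", "varchar", "nvarchar"):
    total = ROWS * column_bytes(sql_type, 500, AVG_CHARS)
    print(f"{sql_type}(500): {total / 1024**2:,.0f} MB")
```

Under these assumptions an nvarchar column needs about twice the space of a varchar column holding the same text, while a padded char(500) column is the most expensive of the three.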
I tried to convert my char(500) columns to varchar(500) by adding new columns with the correct type, copying the data into them, and then removing the old columns. This worked, but the table actually got bigger, because removing a column is only a metadata change and does not actually remove the data. To recover the space I would need to create a new table, copy the data across, and then remove the old table. Of course, I don't have enough space in the primary file to do that.
I thought about copying the table to tempdb, removing the original table, and then copying it back, but tempdb doesn't support FILESTREAM columns (at least to my knowledge), so I would need to hold all 170 GB within the tempdb table. This sounded like a dubious solution, and my test server didn't have enough space on the partition where tempdb was stored. I couldn't find anything on the file size limit of tempdb on SQL Server 2008 Express, but at this point it was all getting too hard.

Storing table data as blob in column in different table(Oracle)

Requirement: We have around 500 tables, and around 10k rows in each table are of interest. We want to store this data as a BLOB in a table. All the data, when exported to a file, is about 250 MB. One option is to store this 250 MB file in a single BLOB (Oracle allows 4 GB); the other is to store each table's data in a BLOB column, i.e. we will have one row per table, and the BLOB column will hold that table's data.
Which option is better in terms of performance? This data also needs to be fetched and inserted into a database.
Basically, this will be delivered to the customer, and our utility will read the data from the BLOB and insert it into the database.
Questions:
1) How do we insert table data as a BLOB into a BLOB column?
2) How do we read from that BLOB column and then prepare insert statements?
3) Is there any benefit from compressing the table that contains the BLOB data? If yes, how do we uncompress it for reading?
4) Will this approach also work on MSSQL and DB2?
What other considerations are there when designing tables that have BLOBs?
Please suggest.

I have the impression you want to go from structured content to unstructured content.
I hope you know what you are trading off, but I do not get that impression from reading your question.
Going with a BLOB, you lose the relationships and constraints between values.
It could be faster to read one block of data, but when you need to write a minor change, you may need to write a bigger "chunk" in the case of big BLOBs.
To insert a BLOB into the database you can use any available API (OCI, JDBC, or even PL/SQL if you access it only on the server side).
For compression, you can use the BLOB option. Alternatively, you can do it yourself using some library (if you need to think about other RDBMS types).

Why do you want to store a table in a BLOB? For archiving or transfer you could export the tables using exp or, preferably, expdp. You can compress these files and transfer them, or store them as BLOBs inside another Oracle database.
The maximum size of a LOB was 4 GB until Oracle release 9, as far as I remember. Today the limit is 8 TB to 128 TB, depending on your DB block size.
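The 8 TB to 128 TB range follows from the formula Oracle gives for the maximum LOB size, (4 GB - 1) * DB_BLOCK_SIZE. A quick check in Python, assuming the usual block sizes from 2 KB to 32 KB (the function name max_lob_bytes is made up for illustration):

```python
GIB = 2**30
TIB = 2**40


def max_lob_bytes(db_block_size: int) -> int:
    """Maximum LOB size in bytes: (4 GB - 1) * DB_BLOCK_SIZE."""
    return (4 * GIB - 1) * db_block_size


# ASSUMPTION: block sizes of 2 KB to 32 KB, in powers of two.
for block_size in (2048, 4096, 8192, 16384, 32768):
    limit_tib = max_lob_bytes(block_size) / TIB
    print(f"DB_BLOCK_SIZE = {block_size:>5}: about {limit_tib:.0f} TB")
```

So the 250 MB export mentioned in the question is far below the limit for any block size.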

Compress Oracle table

I need to compress a table. I used alter table tablename compress to compress the table. After doing this the table size remained the same.
How should I be compressing the table?

To compress the old blocks of the table use:
alter table table_name move compress;
This reinserts the records into other blocks, compressed, and discards the old blocks, so you'll gain space. It also invalidates the indexes, so you will need to rebuild them.

COMPRESS does not affect rows that are already stored. Please check the official documentation:
"You specify table compression with the COMPRESS clause of
the CREATE TABLE statement. You can enable compression for an existing
table by using this clause in an ALTER TABLE statement. In this case,
the only data that is compressed is the data inserted or updated after
compression is enabled..."
ALTER TABLE t MOVE COMPRESS is a valid answer. But if you use non-default options, especially with a big data volume, run regression tests before using ALTER TABLE ... MOVE.
Historically there were more problems (performance degradations and bugs) with it. If you have access, check the Oracle bug database to see whether there are known problems for the features and version you use.
You are on the safer side if you:
1. create a new table
2. insert the data from the original (old) table
3. drop the old table
4. rename the new table to the old table name
