Free Space in Table - Oracle

We have a table with over 500,000,000 rows in it. We did some maintenance and moved some of those records elsewhere. What I'm curious about is how much space is available in that table now that more than half of the rows have been deleted. When I look in the dictionary it still shows 18 GB, much of which must now be unused.
I tried running DBMS_SPACE:
VARIABLE total_blocks NUMBER
VARIABLE total_bytes NUMBER
VARIABLE unused_blocks NUMBER
VARIABLE unused_bytes NUMBER
VARIABLE lastextf NUMBER
VARIABLE last_extb NUMBER
VARIABLE lastusedblock NUMBER
exec DBMS_SPACE.UNUSED_SPACE('CUSTOMER', 'LOGIN_LOG', 'TABLE', :total_blocks, :total_bytes,:unused_blocks, :unused_bytes, :lastextf, :last_extb, :lastusedblock);
However, the FREE_BLOCKS item is NULL:
SQL> print
FREE_BLOCKS
-----------
TOTAL_BLOCKS
------------
2436608
TOTAL_BYTES
-----------
1.9961E+10
So I'm just wondering whether I am missing something, or whether there is a different way of doing this.

You have freed up space, but that space is still allocated (reserved) for that table. So as that table continues to grow, it will reuse that deleted space.
Hence the total size of the table remains unchanged - it is just unlikely to grow any further until all of the deleted space gets reused by new data.
If you are not expecting the table to get any more rows, then you can reorganise the table (shuffle the rows around) in order to reclaim that space back to the tablespace so that other objects can take advantage of it.
To do that you can run one of the following (depending on your version):
alter table T move online;
or
alter table T enable row movement;
alter table T shrink space;
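If you want to confirm how much was handed back, you can compare the dictionary figures before and after the reorganisation (a quick sketch, reusing the owner and table name from the question):
-- Allocated size of the segment as the dictionary sees it;
-- run before and after the MOVE/SHRINK and compare.
select owner, segment_name, round(bytes/1024/1024) as mb, blocks, extents
from dba_segments
where owner = 'CUSTOMER'
and segment_name = 'LOGIN_LOG';
Bear in mind that a plain (non-online) ALTER TABLE ... MOVE leaves the table's indexes UNUSABLE, so they need to be rebuilt afterwards.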


Why does my cursor based copying of a table leave the undo tablespace completely occupied even after finishing the procedure?

I am copying a lot of data from one table to another in a procedure, using a cursor to iterate over the data of the source table, holding the rows in an array and then filling the other table in limited batch sizes.
I do realize that there are better ways to do this, but I developed it this way because of language restrictions. On to my code:
PROCEDURE copy_tableA_into_tableB IS
  TYPE tableA_array IS TABLE OF table_a%ROWTYPE;
  tableA_initialized_array tableA_array;
  CURSOR table_a_cursor IS
    SELECT *
    FROM table_a; -- get all data from TABLE_A
BEGIN
  OPEN table_a_cursor;
  LOOP
    FETCH table_a_cursor BULK COLLECT
      INTO tableA_initialized_array LIMIT 10000;
    FORALL i IN 1 .. tableA_initialized_array.COUNT
      INSERT INTO table_b
      VALUES tableA_initialized_array (i);
    COMMIT;
    EXIT WHEN table_a_cursor%NOTFOUND;
  END LOOP;
  CLOSE table_a_cursor;
END copy_tableA_into_tableB;
This code works just fine, but one weird thing is happening:
when I execute it a couple of times with a bulk of data like 1.5 million table rows, the UNDO tablespace gets bigger and bigger, and even though the procedure is done it is still allocated by my procedure. Eventually my UNDO tablespace is full and I get an exception. In fact, I can only drop my UNDO tablespace and rebuild it to get it unoccupied and empty again.
As you can clearly see, I am actually committing every time I go through the array, so why would the UNDO tablespace still be allocated after the transaction is done?
I am not enough of an Oracle expert to understand the underlying concept, but I thought my cursor is deallocated when it is closed, so I don't think it is the culprit. I am using Oracle version 11g, if that is of any concern.
I expected that when I am done with the procedure my UNDO tablespace would be deallocated again, yet when I check the undo tablespace it is still occupied.
EDIT, for the unanswered questions:
I am checking how much UNDO tablespace I have left; the occupied space matches the amount of data copied, and I am the only one running programs.
Exception: ORA-01555: snapshot too old: rollback segment number with name "" too small
I am testing the procedure in PL/SQL Developer under a stored procedure test.
I am not resetting anything, just emptying the tables that I want to copy into with a TRUNCATE.
First off, I don't understand what you mean by "language restrictions". Why would that prevent you from doing a straight insert?
Secondly, you're committing inside the fetch loop - seriously, a bad, bad idea. You could find yourself getting snapshot too old errors.
Thirdly, for your answer: Oracle doesn't immediately free up space in UNDO once you've finished with it - it's marked as no longer needed (something that your commit does, which is why fetching across a commit is a bad idea!) and if another session needs that space then it's overwritten.
Tom Kyte, as ever, says it best
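If the language restriction still forces this into PL/SQL, the whole copy can usually collapse to a single statement inside the procedure (a sketch, assuming TABLE_A and TABLE_B have compatible column lists):
-- One statement, one transaction: no cursor held open across commits,
-- and the undo for it can be reused once the COMMIT completes.
insert into table_b
select * from table_a;
commit;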
Is it not because the opening statement is:
CURSOR table_a_cursor IS
SELECT *
FROM TABLE_A; --get all data from Table_A
Therefore undo is generated so that this query can succeed. The lock on these rows isn't released until after it has completed (by which time the UNDO tablespace is full).
Something like this should work (some modification to the loop may be required - I personally would check against 'x' rows processed before committing, rather than this method):
PROCEDURE Copy_tableA_into_tableB AS
  TYPE tableA_array IS TABLE OF table_a%ROWTYPE;
  tableA_initialized_array tableA_array;
  l_count NUMBER;
  CURSOR table_a_cursor IS
    SELECT * FROM table_a;
BEGIN
  -- Count rows in TABLE_A up front
  EXECUTE IMMEDIATE 'select count(*) from table_a' INTO l_count;
  OPEN table_a_cursor;
  WHILE l_count > 0
  LOOP
    FETCH table_a_cursor BULK COLLECT
      INTO tableA_initialized_array LIMIT 10000;
    EXIT WHEN tableA_initialized_array.COUNT = 0;
    FORALL i IN 1 .. tableA_initialized_array.COUNT
      INSERT INTO table_b
      VALUES tableA_initialized_array (i);
    COMMIT;
    l_count := l_count - tableA_initialized_array.COUNT;
  END LOOP;
  CLOSE table_a_cursor;
END Copy_tableA_into_tableB;
Oracle will not reuse undo space while there is an active transaction using it.
That explains why your tablespace looks completely occupied.
More info in this: Undo tablespace keeps growing
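To see whether that undo is genuinely still needed or merely waiting to be overwritten, you can break the undo extents down by status (a sketch; needs access to the DBA views):
-- ACTIVE    = undo for open transactions (cannot be reused)
-- UNEXPIRED = committed undo still within UNDO_RETENTION
-- EXPIRED   = free to be overwritten on demand
select status, count(*) as extents, round(sum(bytes)/1024/1024) as mb
from dba_undo_extents
group by status;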
If you want to know how much space is occupied:
select tablespace_name,sum(bytes) from dba_segments group by tablespace_name;
and this way you can find the actual (total) size of each tablespace:
select tablespace_name,sum(bytes) from dba_data_files group by tablespace_name;

Find table name from v$datafile.name column

When you look at wait events (e.g. with Toad), you see a file# parameter.
How can I get more useful information, such as the table name?
Is it possible to even know the number of records that are read from that table?
In another forum I found this advice, but it doesn't seem to work:
select segment_name
from dba_extents ext
where ext.file_id = 828
and 10711 between ext.block_id and ext.block_id + ext.blocks - 1
and rownum = 1
Let's talk files, blocks, segments and extents.
A segment is a database object that is stored. It may be a table, index, (sub)partition, cluster or LOB. Mostly you'll be interested in tables and indexes.
A segment is made up of extents. If you think of a segment as a book, an extent is a chapter. A segment (generally) starts with at least one extent. When it needs to store more data and it doesn't have room in the existing extents, it adds another extent to the segment.
An extent lives in a datafile. A datafile can have lots of extents each starting at a different point in the file and having a size. You may have one extent of 15 blocks starting in file 1 at block 10.
A wait event should identify the file and block (and row). If your wait event is for file #1 and block 12, you go off to USER_EXTENTS (or DBA_EXTENTS) and look for the extent in file# 1 where 12 is between the starting block location and the starting block location plus the number of blocks. So block 12 would be between starting block 10 and end block 25 (start plus size).
Once you've identified the extent, you track it back to its parent segment (USER_SEGMENTS / DBA_SEGMENTS) which will give you the table/index name.
A theoretical SQL is as follows :
select username, sid, serial#,
row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#,
ext.*
from v$session s
join dba_extents ext on ext.file_id = row_wait_file#
and row_wait_block# between ext.block_id and ext.block_id + ext.blocks - 1
where username = 'HR'
and status = 'ACTIVE'
For this one I purposefully blocked a session so that it was waiting on a row lock.
828 is a rather large file id. It isn't impossible, but it is unusual. Do a select from DBA_DATA_FILES and see if you have such a file. If not, and you've only got a few files, look at all the objects that match the "10711 between ext.block_id and ext.block_id + ext.blocks - 1" criteria without the file id. You should be able to find a likely candidate from there.
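For example (a sketch using the file# and block# from the question):
-- Is there really a datafile with file_id 828?
select file_id, file_name, tablespace_name
from dba_data_files
order by file_id;

-- If not, list every segment whose extents cover block 10711 in any file
-- and pick the likely candidate from there.
select owner, segment_name, segment_type, file_id
from dba_extents
where 10711 between block_id and block_id + blocks - 1;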
The exception is if the problem was on a temporary segment. Since these get disposed of at the end of the operation, there's no permanent object recorded. In that case the 'name' of the table/index isn't applicable and you need to tackle any performance issue another way (e.g. look at the SQL and its explain plan and work out whether it is correct in using lots of temp space).

Select only new rows in Oracle

I have a table with a VARCHAR2 primary key.
It has about 1,000,000 transactions per day.
My app wakes up every 5 minutes to generate a text file by querying only the new records.
It will remember the last point and process only new records.
Do you have any ideas on how to query with good performance?
I am able to add a new column if necessary.
What do you think this process should be written in?
PL/SQL?
Java?
Everyone here is really really close. However:
Scott Bailey's wrong about using a bitmap index if the table's under any sort of continuous DML load. That's exactly the wrong time to use a bitmap index.
Everyone else's answer about the PROCESSED CHAR(1) CHECK IN ('Y','N') column is right, but misses how to index it; you should use a function-based index like this:
CREATE INDEX MY_UNPROCESSED_ROWS_IDX ON MY_TABLE
(CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END);
You'd then query it using the same expression:
SELECT * FROM MY_TABLE
WHERE (CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END) = 'N';
The reason to use the function-based index is that Oracle doesn't write index entries for entirely NULL values being indexed, so the function-based index above will only contain the rows with PROCESSED_FLAG = 'N'. As you update your rows to PROCESSED_FLAG = 'Y', they'll "fall out" of the index.
Well, if you can add a new column, you could create a Processed column, which will indicate processed records, and create an index on this column for performance.
Then the query should only be for those rows that have been newly added, and not processed.
This should be easily done using sql queries.
Ah, I really hate to add another answer when the others have come so close to nailing it. But
As Ponies points out, Oracle does have a hidden column (ORA_ROWSCN - System Change Number) that can pinpoint when each row was modified. Unfortunately, the default is that it gets the information from the block instead of storing it with each row and changing that behavior will require you to rebuild a really large table. So while this answer is good for quieting the SQL Server fella, I'd not recommend it.
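For reference only, the row-level variant would look roughly like this (a sketch with made-up names; ROWDEPENDENCIES can only be set when the table is created, which is exactly the rebuild problem mentioned above):
-- Hypothetical table created with row-level SCN tracking
create table my_table_rowdep (
  pk   varchar2(30) primary key,
  data varchar2(100)
) rowdependencies;

-- Rows changed since the SCN recorded by the previous run
select pk, data, ora_rowscn
from my_table_rowdep
where ora_rowscn > :last_processed_scn;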
Astander is right, but it needs a few caveats. Add a new column needs_processed CHAR(1) DEFAULT 'Y' and add a bitmap index. For low-cardinality columns ('Y'/'N') the bitmap index will be faster. Once you have that, the rest is pretty easy. But you've got to be careful about how you select the new rows, process them and mark them as processed. Otherwise, rows could be inserted while you are processing that will get marked processed even though they have not been.
The easiest way would be to use PL/SQL to open a cursor that selects unprocessed rows, processes them and then updates each row as processed. If you have an aversion to walking cursors, you could collect the PKs or rowids into a nested table, process them and then update using the nested table.
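A minimal sketch of the cursor-walking version (hypothetical table and column names, using the needs_processed flag described above):
declare
  cursor c_unprocessed is
    select rowid as rid, pk_col
    from my_table
    where needs_processed = 'Y';
begin
  for r in c_unprocessed loop
    -- write r.pk_col (and whatever other columns you need) to the text file here
    update my_table
    set needs_processed = 'N'
    where rowid = r.rid;
  end loop;
  commit;
end;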
In MS SQL Server world where I work, we have a 'version' column of type 'timestamp' on our tables.
So, to answer #1, I would add a new column.
To answer #2, I would do it in plsql for performance.
Mark
"astander" pretty much did the work for you. You need to ALTER your table to add one more column (lets say PROCESSED)..
You can also consider creating an INDEX on the PROCESSED ( a bitmap index may be of some advantage, as the possible value can be only 'y' and 'n', but test it out ) so that when you query it will use INDEX.
Also if sure, you query only for every 5 mins, check whether you can add another column with TIMESTAMP type and partition the table with it. ( not sure, check out again ).
I would also think about writing job or some thing and write using UTL_FILE and show it front end if it can be.
If performance is really a problem and you want to create your file asynchronously, you might want to use Oracle Streams, which will actually get modification data from your redo log without affecting the performance of the main database. You may not even need a separate job, as you can configure Oracle Streams to do asynchronous replication of the changes, through which you can trigger the file creation.
Why not create an extra table that holds two columns: an ID column and a processed-flag column. Have an insert trigger on the original table place its ID in this new table. Your logging process can then select records from this new table and mark them as processed. Finally, delete the processed records from this table.
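Something along these lines (a sketch; the table, trigger and column names are made up):
-- Staging table holding the IDs of newly inserted rows
create table new_rows_log (
  id        varchar2(100) not null,     -- same type as the main table's primary key
  processed char(1) default 'N' not null
);

-- Trigger on the original table records each new ID
create or replace trigger my_table_air
after insert on my_table
for each row
begin
  insert into new_rows_log (id) values (:new.id);
end;
/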
I'm pretty much in agreement with Adam's answer. But I'd want to do some serious testing compared to an alternative.
The issue I see is that you need to not only select the rows, but also do an update of those rows. While that should be pretty fast, I'd like to avoid the update. And avoid having any large transactions hanging around (see below).
The alternative would be to add CREATE_DATE date default sysdate. Index that. And then select records where create_date >= (start date/time of your previous select).
But I don't have enough data on the relative costs of setting a sysdate as default vs. setting a value of Y, of updating the function-based index vs. the date index, and of doing a range select on the date vs. a specific select on a single value for the Y. You'll probably want to preserve stats or hint the query to use the index on the Y/N column, and you'll definitely want to use a hint on the date column -- the stats on the date column will almost certainly be old.
If data are also being added to the table continuously, including during the period when your query is running, you need to watch out for transaction control. After all, you don't want to read 100,000 records that have the flag = Y, then do your update on 120,000, including the 20,000 that arrived while your query was running.
In the flag case, there are two easy ways: SET TRANSACTION before your select and commit after your update, or start by doing an update from Y to Q, then do your select for those that are Q, and then update to N. Oracle's read consistency is wonderful but needs to be handled with care.
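The second (Y to Q to N) variant might look like this (sketch only, using the needs_processed column from above):
-- Claim a batch: only rows that already exist move to 'Q'
update my_table set needs_processed = 'Q' where needs_processed = 'Y';
commit;

-- Process exactly the claimed rows
select * from my_table where needs_processed = 'Q';

-- Mark the claimed rows as done
update my_table set needs_processed = 'N' where needs_processed = 'Q';
commit;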
For the date column version, if you don't mind a risk of processing a few rows more than once, just update your table that has the last processed date/time immediately before you do your select.
If there's not much information in the table, consider making it Index Organized.
What about using Materialized view logs? You have a lot of options to play with:
SQL> create table test (id_test number primary key, dummy varchar2(1000));
Table created
SQL> create materialized view log on test;
Materialized view log created
SQL> insert into test values (1, 'hello');
1 row inserted
SQL> insert into test values (2, 'bye');
1 row inserted
SQL> select * from mlog$_test;
ID_TEST SNAPTIME$$ DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------------
1 01/01/4000 I N FE
2 01/01/4000 I N FE
SQL> delete from mlog$_test where id_test in (1,2);
2 rows deleted
SQL> insert into test values (3, 'hello');
1 row inserted
SQL> insert into test values (4, 'bye');
1 row inserted
SQL> select * from mlog$_test;
ID_TEST SNAPTIME$$ DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------
3 01/01/4000 I N FE
4 01/01/4000 I N FE
I think this solution should work.
What you need to do is the following steps.
For the first run, you will have to copy all records. In the first run you need to execute the following statement:
insert into new_table (max_rowid) select max(rowid) from yourtable;
Now, the next time you want to get only the newly inserted values, you can do it by executing the following command:
Select * from yourtable where rowid > (select max_rowid from new_table);
Once you are done with processing the above query, simply truncate new_table and insert max(rowid) from yourtable again.
I think this should work and would be the fastest solution.

How to delete large data from Oracle 9i DB?

I have a table that is 5 GB, and I was trying to delete from it like below:
delete from tablename
where to_char(screatetime,'yyyy-mm-dd') <'2009-06-01'
But it runs for a long time with no response. Meanwhile I tried to check whether anybody is blocking it with this:
select l1.sid, ' IS BLOCKING ', l2.sid
from v$lock l1, v$lock l2
where l1.block =1 and l2.request > 0
and l1.id1=l2.id1
and l1.id2=l2.id2
But I didn't find any blocking also.
How can I delete this large data without any problem?
5GB is not a useful measurement of table size. The total number of rows matters. The number of rows you are going to delete as a proportion of the total matters. The average length of the row matters.
If the proportion of the rows to be deleted is tiny it may be worth your while creating an index on screatetime which you will drop afterwards. This may mean your entire operation takes longer, but crucially, it will reduce the time it takes for you to delete the rows.
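For example (a sketch; the index name is made up):
create index tablename_screatetime_ix on tablename (screatetime);

-- run the delete (with an indexable predicate on screatetime), then:
drop index tablename_screatetime_ix;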
On the other hand, if you are deleting a large chunk of rows you might find it better to
Create a copy of the table using:
create table t1_copy as select * from t1
where screatetime >= to_date('2009-06-01','yyyy-mm-dd')
Swap the tables using the rename command.
Re-apply constraints and indexes to the new T1 (see the sketch below).
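The swap itself is just (a sketch, keeping the t1/t1_copy names from above):
rename t1 to t1_old;
rename t1_copy to t1;

-- then re-create indexes, constraints, grants and triggers on the new t1,
-- and drop t1_old once you are satisfied with the result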
Another thing to bear in mind is that deletions eat more UNDO than other transactions, because they take more information to rollback. So if your records are long and/or numerous then your DBA may need to check the UNDO tablespace (or rollback segs if you're still using them).
Finally, have you done any investigation to see where the time is actually going? DELETE statements are just another query, and they can be tackled using the normal panoply of tuning tricks.
Use a query condition to export necessary rows
Truncate table
Import rows
If there is an index on screatetime your query may not be using it. Change your statement so that your where clause can use the index.
delete from tablename where screatetime < to_date('2009-06-01','yyyy-mm-dd')
It runs MUCH faster when you lock the table first. Also change the where clause, as suggested by Rene.
LOCK TABLE tablename IN EXCLUSIVE MODE;
DELETE FROM tablename
where screatetime < to_date('2009-06-01','yyyy-mm-dd');
EDIT: If the table cannot be locked, because it is constantly accessed, you can choose the salami tactic to delete those rows:
BEGIN
  LOOP
    DELETE FROM tablename
    WHERE screatetime < to_date('2009-06-01','yyyy-mm-dd')
      AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
  END LOOP;
END;
Overall, this will be slower, but it won't burst your rollback segment and you can see the progress in another session (i.e. the number of rows in tablename goes down). And if you have to kill it for some reason, the rollback won't take forever and you haven't lost all the work done so far.

Inserts are 4x slower if the table has lots of records (400K) vs. if it's empty

(Database: Oracle 10G R2)
It takes 1 minute to insert 100,000 records into a table. But if the table already contains some records (400K), then it takes 4 minutes and 12 seconds; also CPU-wait jumps up and “Free Buffer Waits” become really high (from dbconsole).
Do you know what's happening here? Is this because of frequent table extents? The extent size for these tables is 1,048,576 bytes. I have a feeling the DB is trying to extend the table storage.
I am really confused about this. So any help would be great!
This is the insert statement:
begin
  for i in 1 .. 100000 loop
    insert into customer
      (id, business_name, address1,
       address2, city,
       zip, state, country, fax,
       phone, email)
    values
      (customer_seq.nextval, dbms_random.string ('A', 20), dbms_random.string ('A', 20),
       dbms_random.string ('A', 20), dbms_random.string ('A', 20),
       trunc (dbms_random.value (10000, 99999)), 'CA', 'US', '798-779-7987',
       '798-779-7987', 'asdfasf#asfasf.com');
  end loop;
end;
Here is the dstat output (CPU, IO, memory, net) for:
Empty Table inserts: http://pastebin.com/f40f50dbb
Table with 400K records: http://pastebin.com/f48d8ebc7
Output from v$buffer_pool_statistics
ID: 3
NAME: DEFAULT
BLOCK_SIZE: 8192
SET_MSIZE: 4446
CNUM_REPL: 4446
CNUM_WRITE: 0
CNUM_SET: 4446
BUF_GOT: 1407656
SUM_WRITE: 1244533
SUM_SCAN: 0
FREE_BUFFER_WAIT: 93314
WRITE_COMPLETE_WAIT: 832
BUFFER_BUSY_WAIT: 788
FREE_BUFFER_INSPECTED: 2141883
DIRTY_BUFFERS_INSPECTED: 1030570
DB_BLOCK_CHANGE: 44445969
DB_BLOCK_GETS: 44866836
CONSISTENT_GETS: 8195371
PHYSICAL_READS: 930646
PHYSICAL_WRITES: 1244533
UPDATE
I dropped the indexes off this table and performance improved drastically, even when inserting 100K rows into a table with 600K records (which took 47 seconds with no CPU wait - see dstat output http://pastebin.com/fbaccb10).
Not sure if this is the same in Oracle, but in SQL Server the first thing I'd check is how many indexes you have on the table. If it's a lot the DB has to do a lot of work reindexing the table as records are inserted. It's more difficult to reindex 500k rows than 100k.
The indices are some form of tree, which means the time to insert a record is going to be O(log n), where n is the size of the tree (≈ number of rows for the standard unique index).
The fastest way to insert them is going to be dropping/disabling the index during the insert and recreating it after, as you've already found.
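In practice that looks roughly like this (a sketch; the index name is made up, and it only applies to non-unique indexes - a unique or primary key index cannot simply be skipped this way):
-- Mark the non-unique index unusable so inserts skip its maintenance
alter index customer_bname_ix unusable;

-- ... run the bulk insert; with skip_unusable_indexes = true (the 10g default)
--     DML simply ignores the unusable non-unique index ...

-- Rebuild it in a single pass afterwards
alter index customer_bname_ix rebuild;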
Even with indexes, 4 minutes to insert 100,000 records seems like a problem to me.
If this database has I/O problems, you haven't fixed them and they will appear again. I would recommend that you identify the root cause.
If you post the index DDL, I'll time it for a comparison.
I added indexes on id and business_name. Doing 10 iterations in a loop, the average time per 100,000 rows was 25 seconds. This was on my home PC/server all running on a single disk.
Another trick to improve performance is to turn on caching, or set the cache higher, on your sequence (customer_seq). This will allow Oracle to allocate a batch of sequence values in memory instead of hitting the object for each insert.
Be careful with this one, though. In some situations this will cause your sequence to have gaps between values.
More information here:
Oracle/PLSQL: Sequences (Autonumber)
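For example (the cache size here is arbitrary):
alter sequence customer_seq cache 1000;
A larger cache means NEXTVAL is usually served from memory instead of updating the data dictionary on every call, at the cost of losing up to that many values if the instance restarts.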
Sorted inserts always take longer the more entries there are in the table.
You don't say which columns are indexed. If you had indexes on fax, phone or email, you would have had a LOT of duplicates (ie every row).
Oracle 'pretends' to have non-unique indexes. In reality every index entry is unique with the rowid of the actual table row being the deciding factor. The rowid is made up of the file/block/record.
It is possible that, once you hit a certain number of records, the new ones were getting rowids which meant they had to be fitted into the middle of existing indexes, with a lot of index re-writing going on.
If you supply full table and index creation statements, others would be able to reproduce the experience which would have allowed for more evidence based responses.
I think it has to do with extending the internal structure of the file, as well as building database indexes for the added information - I believe the database arranges the data in a non-linear fashion that helps speed up data retrieval on selects.
