In my business case, I need to insert one row at a time and can't use batch inserts, so I want to know what throughput Oracle can achieve. I have tried these approaches:
Effective:
Using multiple threads, each with its own connection, to insert data
Storing the Oracle datafiles on SSD
Ineffective:
Using multiple tables in one schema to store the data
Using table partitioning
Using multiple schemas to store the data
Increasing the data file block size
Using the APPEND hint in the INSERT statement
In the end, the best throughput was a little over 10,000 inserts per second.
Other:
Oracle 11g
Each inserted row is about 1 KB
CPU: Intel i7, 64 GB memory
Oracle is highly optimized for anything from one row inserts to batches of hundreds of rows. You do not mention whether you are having performance problems with this one row insert nor how long the insert takes. For such a simple operation, you don't need to worry about any of those details. If you have thousands of web-based users inserting one row into a table every minute, no problem. If you are committing your work at the appropriate time, and you don't have a huge number of indexes, a single row insert should not take more than a few milliseconds.
In SQL*Plus try the commands
set autotrace on explain statistics
set timing on
and run your insert statement.
Edit your question to include the results of the explain plan. And be sure to indent the results 4 spaces.
Related
In Oracle I came across two types of INSERT statements:
1) INSERT ALL: multiple rows can be inserted using a single SQL statement.
2) INSERT: one row is inserted per statement.
Now I want to insert around 100,000 records at a time. (The table has 10 fields, including a primary key.) I am not concerned about any return value.
I am using Oracle 11g.
Which is better for performance: INSERT or INSERT ALL?
I know this is kind of a necro, but it's pretty high in the Google search results, so I think this point is worth making.
INSERT ALL can give dramatic performance benefits if you are building a web application, because it is a single SQL statement that requires only one round trip to your database. In most cases (although far from all), the majority of the cost of a query is actually latency. Depending on what framework you are using, this syntax can help you avoid unnecessary round trips.
This might seem incredibly obvious but I have seen many, many production web applications in large companies that have forgotten this simple fact.
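For reference, a minimal sketch of the multi-row INSERT ALL form being discussed; the table and column names are only illustrative:

INSERT ALL
  INTO my_table (id, name) VALUES (1, 'first')
  INTO my_table (id, name) VALUES (2, 'second')
  INTO my_table (id, name) VALUES (3, 'third')
SELECT * FROM dual;

All three rows travel in a single statement, so only one network round trip is paid.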
The INSERT statement and the INSERT ALL statement are practically the same conventional insert. INSERT ALL, introduced in 9i, simply allows you to insert into multiple tables using one statement. Another type of insert that you could use to speed up the process is the direct-path insert, using the /*+ append */ or (in Oracle 11g) /*+ append_values */ hint:
insert /*+ append*/ into some_table(<<columns>>)
select <<columns or literals>>
from <<somewhere>>
or (Oracle 11g)
insert /*+ append_values*/ into some_table(<<columns>>)
values(<<values>>)
to tell Oracle that you want to perform a direct-path insert. But 100K rows is not that many, and a conventional INSERT statement will do just fine; you won't get a significant performance advantage from a direct-path insert with that amount of data. Moreover, a direct-path insert won't reuse free space: it adds new data above the HWM (high water mark) and hence requires more space. You also won't be able to run a SELECT or other DML statement against the table in the same session until you have issued a COMMIT.
To use FORALL you would need PL/SQL collections (such as nested tables).
This process is quite fast.
You can also create the table with the NOLOGGING option, which would speed the process up during inserts.
Our application manages a table containing a per-user set of rows that is the
result of a computationally-intensive query. Storing this result in a table
seems a good way of speeding up further calculations.
The structure of that table is basically the following:
CREATE TABLE per_user_result_set
( user_login VARCHAR2(N)
, result_set_item_id VARCHAR2(M)
, CONSTRAINT result_set_pk PRIMARY KEY(user_login, result_set_item_id)
)
;
A typical user of our application will have this result set computed 30 times a
day, with a result set consisting of anywhere between a single item and 500,000 items.
A typical customer will declare about 500 users into the production database.
So, this table will typically consist of 5 million rows.
The typical query that we use to update this table is:
BEGIN
DELETE FROM per_user_result_set WHERE user_login = :x;
INSERT INTO per_user_result_set(...) SELECT :x, ... FROM ...;
END;
/
After running into performance issues (the DELETE part would take a long time),
we decided to use a GLOBAL TEMPORARY TABLE (ON COMMIT DELETE ROWS) to hold a
“delta” of rows to remove from the table and rows to insert into it:
BEGIN
INSERT INTO _tmp
SELECT ... FROM ...
MINUS SELECT result_set_item_id
FROM per_user_result_set
WHERE user_login = :x;
DELETE FROM per_user_result_set
WHERE user_login = :x
AND result_set_item_id NOT IN (SELECT result_set_item_id
FROM _tmp
);
INSERT INTO per_user_result_set
SELECT :x, result_set_item_id
FROM _tmp;
COMMIT;
END;
/
This has improved performance a bit, but still this is not satisfactory. So
we're exploring ways to speed up that process and here are the issues that
we experience:
We would have loved to use table partitioning (partitioning by user_login).
But partitioning is not always available (on our test databases we hit
ORA-00439). Our customers cannot all afford Oracle Enterprise Edition with
paid additional features.
We could make the per_user_result_set table GLOBAL TEMPORARY, so that it
is isolated and we can TRUNCATE it for example… but our application
sometimes loses connection to Oracle due to network problems, and will
automatically reconnect. By that time we lose the contents of our
computation.
We could split that table into a certain number of buckets, create a view that
UNION ALLs all those buckets with INSTEAD OF UPDATE and DELETE triggers on that
view, and distribute rows across the buckets according to
ORA_HASH(user_login) % num_buckets. But we are afraid this could make SELECT
operations much slower. This would result in a constant number of tables, with
smaller indexes affected by DELETE or INSERT operations. In short, “table
partitioning for the poor”.
We've tried to ALTER TABLE per_user_result_set NOLOGGING. This does not
improve things much.
We've tried to CREATE TABLE ... ORGANIZATION INDEX COMPRESS 1. This speeds
things up by a ratio of 1:5.
We've tried to have one table per user_login. That's exactly what we could
have by partitioning using a number of partitions equal to the number of
distinct user_logins and a well-chosen hash function. Performance factor is
1:10. But I would really like to avoid this solution: having to maintain a
huge number of indexes, tables, and views on a per-user basis. This would be
an interesting performance gain for the users, but not for us maintainers of
the systems.
Since the users work at the same time, there is no way for us to create a new
table and swap it with the old one.
What could you suggest in addition to these approaches?
Note: our customers run Oracle databases ranging from 9i to 11g, and from XE to
Enterprise Edition. That's a wide range of versions that we need to be
compatible with.
Thanks.
We've tried to have one table per user_login. That's exactly what we
could have by partitioning using a number of partitions equal to the
number of distinct user_logins and a well-chosen hash function.
Performance factor is 1:10. But I would really like to avoid this
solution: having to maintain a huge number of indexes, tables, and views on
a per-user basis. This would be an interesting performance gain for
the users, but not for us maintainers of the systems.
Can you then make a stored procedure to generate these tables on a per-user basis? Or, better yet, have this stored procedure do the most appropriate thing depending on which Oracle edition and options the customer is licensed for?
If the Partitioning option is available
then
    create or truncate the user-specific list partition
else
    drop the user-specific result table
    create the user-specific result table
        as select from the template result table
    create indexes
    create constraints
    perform grants
end if
Perform the insert
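A hedged PL/SQL sketch of that branching; the v$option lookup, the per-user naming scheme and the template table are illustrative assumptions, not a tested implementation:

CREATE OR REPLACE PROCEDURE prepare_user_result_set(p_user IN VARCHAR2) IS
  l_partitioning VARCHAR2(64);
BEGIN
  -- check whether the Partitioning option is available on this database
  SELECT value INTO l_partitioning
  FROM   v$option
  WHERE  parameter = 'Partitioning';

  IF l_partitioning = 'TRUE' THEN
    -- partitioned case: empty the user's list partition in place
    EXECUTE IMMEDIATE
      'ALTER TABLE per_user_result_set TRUNCATE PARTITION p_' || p_user;
  ELSE
    -- non-partitioned case: rebuild a per-user copy of a template table
    EXECUTE IMMEDIATE 'DROP TABLE result_set_' || p_user;
    EXECUTE IMMEDIATE 'CREATE TABLE result_set_' || p_user ||
                      ' AS SELECT * FROM result_set_template WHERE 1 = 0';
    -- indexes, constraints and grants would be re-created here
  END IF;
  -- the INSERT of the freshly computed rows follows in either branch
END;
/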
If all your users were on 11g Enterprise Edition I would recommend you to use Oracle's built-in result-set caching rather than trying to roll your own. But that is not the case, so let's move on.
Another attractive option might be to use PL/SQL collections rather than tables. Being in memory these are faster to retrieve and require less maintenance. They are also supported in all the versions you need. However, they are session variables, so if you have lots of users with big result sets that would put stress on your PGA allocations. Also their data would be lost when the network connection drops. So that's probably not the solution you're looking for.
The core of your problem is this statement:
DELETE FROM per_user_result_set WHERE user_login = :x;
It's not a problem in itself but you have extreme variations in data distribution. Bluntly, the deletion of a single row is going to have a very different performance profile from the deletion of half a million rows. And because your users are constantly refreshing their data there is no way you can handle that, except by giving your users their own tables.
You say you don't want to have a table per user because
"[it] would be an interesting performance gain for the users, but not
for us maintainers of the systems,"
Systems exist for the benefit of our users. Convenience for us is great as long as it helps us to provide better service to them. But their need for a good working experience trumps ours: they pay the bills.
But I question whether having individual tables for each user really increases the work load. I presume each user has their own account, and hence schema.
I suggest you stick with index-organized tables. You only need columns which are in the primary key and maintaining a separate index is unnecessary overhead (for both inserting and deleting). The big advantage of having a table per user is that you can use TRUNCATE TABLE in the refresh process, which is a lot faster than deletion.
So your refresh procedure will look like this:
BEGIN
TRUNCATE TABLE per_user_result_set REUSE STORAGE;
INSERT INTO per_user_result_set(...)
SELECT ... FROM ...;
DBMS_STATS.GATHER_TABLE_STATS(user
, 'PER_USER_RESULT_SET'
, estimate_percent=>10);
COMMIT;
END;
/
Note that you don't need to include the USER column any more, so your table will just have the single column result_set_item_id (another indication of the suitability of an IOT).
Gathering the table stats isn't mandatory but it is advisable. You have a wide variability in the size of result sets, and you don't want to be using an execution plan devised for 500000 rows when the table has only one row, or vice versa.
The only overhead is the need to create the table in the user's schema. But presumably you already have some set-up for a new user - creating the account, granting privileges, etc - so this shouldn't be a big hardship.
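As a rough sketch, each per-user table could then be as simple as the following; the VARCHAR2 length is an arbitrary placeholder:

CREATE TABLE per_user_result_set
( result_set_item_id VARCHAR2(100)
, CONSTRAINT result_set_pk PRIMARY KEY (result_set_item_id)
)
ORGANIZATION INDEX;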
I have a cursor that selects all rows in a table, a little over 500,000 rows. I read a row from the cursor, INSERT it into another table (which has two indexes, neither unique: one numeric, one DATE type), COMMIT, then read the next row from the cursor and INSERT... until the cursor is empty.
All my DATE column's values are the same, from a timestamp initialized at the start of the script.
This thing has been running for 24 hours and has only posted 464K rows, roughly 19K rows per hour.
Oracle 11g, 10 processors(!?)
Something has to be wrong. I think it's that DATE index trying to process all these entries with exactly the same value for that column.
Why don't you just do:
insert into target (columns....)
select columns and computed values
from source
commit
?
This slow-by-slow processing is doing far more damage to performance than an index that may not even make sense.
Indexes slow down inserts but speed up queries. This is normal.
If it is a problem you can remove the index, insert the rows, then add the index again. This can be faster if you are doing many inserts at once.
The way you are copying the data using cursors seems to be inefficient. You could try a set-based approach instead:
INSERT INTO table1 (x, y, z)
SELECT x, y, z FROM table2 WHERE ...
Committing after every inserted row doesn't make much sense. If you're worried about exceeding undo capacity, for example, you can keep a count of the inserts and issue a commit after every thousand rows.
Updating the indexes will have some impact, but that's unavoidable if you can't drop (or disable) them while the inserts are performed; that's just how it goes. I'd expect the commits to have a bigger impact, though I suspect that's a topic with varied opinions.
This assumes you have a good reason for inserting from a cursor rather than as a direct insert into ... select from model.
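A hedged sketch of the counter-based commit mentioned above, reusing the illustrative table1/table2 names; the batch size of 1,000 is arbitrary:

DECLARE
  l_count PLS_INTEGER := 0;
BEGIN
  FOR rec IN (SELECT x, y, z FROM table2) LOOP
    INSERT INTO table1 (x, y, z) VALUES (rec.x, rec.y, rec.z);
    l_count := l_count + 1;
    IF MOD(l_count, 1000) = 0 THEN
      COMMIT;  -- commit every 1,000 rows instead of after each row
    END IF;
  END LOOP;
  COMMIT;      -- final commit for the remaining rows
END;
/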
In general, it's often a good idea to drop the indexes before doing a massive insert and then add them back afterwards, so that the DB doesn't have to update the indexes with each insert. It's been a long while since I've used Oracle, but have you tried putting more than one insert statement in a transaction? That should also speed it up.
For operations like this you should look at Oracle bulk operations, using FORALL and BULK COLLECT. They will considerably reduce the number of context switches between the PL/SQL and SQL engines:
create or replace procedure fast_proc is
  -- collection typed on the source table's rows
  type t_tab is table of source_table%ROWTYPE;
  l_tab t_tab;
begin
  -- fetch the source rows into the collection in one pass
  select * bulk collect into l_tab from source_table;
  -- bind the whole collection into a single bulk insert
  forall i in 1 .. l_tab.count
    insert into dest_table values l_tab(i);
end;
/
Agreed with the comment that what is killing your time is the 'slow by slow' processing. Copying 500,000 rows should be a matter of minutes.
The single INSERT ... SELECT FROM .... approach would be the best one, provided you have big enough Rollback segments. The database may even automatically apply parallel techniques to a plain SQL statement that it will not do with PL/SQL.
In addition you could look at using the /*+ APPEND */ hint - read up on it and see if it may apply to the situation with your target table.
To use all 10 cores you will need to either use plain parallel SQL, or run 10 copies of your PL/SQL block, splitting the source table across the 10 copies.
In Oracle 10 this is a manual task (roll your own parallelism) but Oracle 11.2 introduces DBMS_PARALLEL_EXECUTE.
Failing that, bulking up your fetch/insert using BULK COLLECT and a bulk INSERT (FORALL) would be the next best option: process in chunks of 1,000 or so rows (or larger). Again, take a look at whether DBMS_PARALLEL_EXECUTE may help you, or whether you could submit the job in chunks via DBMS_JOB.
(Caveat : I don't have access to anything later than Oracle 10)
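For what it's worth, a minimal sketch of DBMS_PARALLEL_EXECUTE on 11.2 or later; the task name, table and column names and chunk size are illustrative, and the chunk SQL must use the :start_id and :end_id binds:

BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'copy_rows');

  -- split the source table into rowid ranges of roughly 10,000 rows each
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'copy_rows',
    table_owner => USER,
    table_name  => 'SOURCE_TABLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- run the insert for each chunk with up to 10 jobs in parallel
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'copy_rows',
    sql_stmt       => 'INSERT INTO target_table (num_col, date_col)
                       SELECT num_col, SYSDATE
                       FROM   source_table
                       WHERE  rowid BETWEEN :start_id AND :end_id',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 10);

  DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => 'copy_rows');
END;
/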
We're trying to figure out the best way to handle BULK INSERTs using Oracle (10gR2), and I'm finding that it can be a pretty complicated subject. One method that I've found involves using the Append optimizer hint:
INSERT /*+ Append*/
INTO some_table (a, b)
VALUES (1, 2)
My understanding is that this will tell Oracle to ignore indexes and just put the results at the end of the table. Then, all I should have to do is rebuild the indexes:
ALTER INDEX some_index REBUILD
This would be easier than trying to launch SQL*Loader as an external process or writing some PL/SQL. This almost seems too easy. Is there something I'm missing? Is there anything that could come back to bite me if I take this approach?
A few notes ...
A single row cannot be appended, therefore APPEND is only valid with INSERT INTO ... SELECT FROM syntax.
An append is the addition of data above the high water mark of the table: the data is formatted into complete blocks that are then written directly to the datafiles, bypassing the buffer cache.
An append in parallel mode requires that each parallel query thread allocate at least one new extent to the table, into which the new blocks are written. This can be wasteful of space.
The indexes are not ignored, but maintenance of them is deferred until the blocks have been written into the table.
See the docs for more important information: http://download.oracle.com/docs/cd/B19306_01/server.102/b14231/tables.htm#ADMIN01509
I am writing a data conversion in PL/SQL that processes data and loads it into a table. According to the PL/SQL Profiler, one of the slowest parts of the conversion is the actual insert into the target table. The table has a single index.
To prepare the data for load, I populate a variable using the rowtype of the table, then insert it into the table like this:
insert into mytable values r_myRow;
It seems that I could gain performance by doing the following:
Turn logging off during the insert
Insert multiple records at once
Are these methods advisable? If so, what is the syntax?
It's much better to insert a few hundred rows at a time, using PL/SQL collections and FORALL to bind them into the insert statement. For details on this see here.
Also be careful with how you construct the PL/SQL tables. If at all possible, prefer to instead do all your transforms directly in SQL using "INSERT INTO t1 SELECT ..." as doing row-by-row operations in PL/SQL will still be slower than SQL.
In either case, you can also use direct-path inserts by using INSERT /*+APPEND*/, which basically bypasses the DB cache and directly allocates and writes new blocks to data files. This can also reduce the amount of logging, depending on how you use it. This also has some implications, so please read the fine manual first.
Finally, if you are truncating and rebuilding the table it may be worthwhile to first drop (or mark unusable) and later rebuild indexes.
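A hedged sketch of the "few hundred rows at a time" approach, using BULK COLLECT ... LIMIT with FORALL; staging_table is a hypothetical source with the same structure as mytable, and 500 is an arbitrary batch size:

DECLARE
  CURSOR c_src IS SELECT * FROM staging_table;
  TYPE t_rows IS TABLE OF mytable%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c_src;
  LOOP
    FETCH c_src BULK COLLECT INTO l_rows LIMIT 500;  -- a few hundred rows per bind
    FORALL i IN 1 .. l_rows.COUNT
      INSERT INTO mytable VALUES l_rows(i);
    EXIT WHEN c_src%NOTFOUND;
  END LOOP;
  CLOSE c_src;
  COMMIT;
END;
/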
Regular insert statements are the slowest way to get data into a table and are not meant for bulk inserts. The following article references a lot of different techniques for improving performance: http://www.dba-oracle.com/oracle_tips_data_load.htm
Drop the index, then insert the rows, then re-create the index.
If dropping the index doesn't speed things up enough, you need the Oracle SQL*Loader:
http://www.oracle.com/technology/products/database/utilities/htdocs/sql_loader_overview.html
Suppose you have the columns eid, ename, sal and job. First create the table:
SQL> create table tablename (eid number, ename varchar2(20), sal number, job char(10));
Now insert the data:
SQL> insert into tablename values (&eid, '&ename', &sal, '&job');
Check this link
http://www.dba-oracle.com/t_optimize_insert_sql_performance.htm
The main points to consider in your case:
Use the APPEND hint, as this appends data directly into the table above the high water mark instead of using the freelists. If you can afford to turn off logging, combine APPEND with NOLOGGING to minimize redo.
Use a bulk insert instead of iterating in PL/SQL.
Use SQL*Loader to load the data directly into the table if you are getting the data from a file feed.
Here are my recommendations on fast insert.
Trigger - Disable any triggers associated with a table. Enable after Inserts are complete.
Index - Drop Index and re-create it after your Inserts are complete.
Stale stats - Re-analyze table and index stats.
Index de-fragmentation - Rebuild Index if needed
Use NOLOGGING - Insert using INSERT /*+ APPEND */ (Oracle only). This is a risky approach: minimal redo is generated, so the loaded data cannot be recovered from the logs after a failure. Make a backup of the table before you start and don't try it on live tables. Check whether your database has a similar option.
Parallel insert: Running the insert in parallel will get the job done faster.
Use bulk inserts.
Constraints - Not much overhead during inserts, but still a good idea to check if things are still slow even after step 1.
You can learn more on http://www.dbarepublic.com/2014/04/slow-insert.html
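Pulling a few of these together, a hedged sketch of a NOLOGGING, direct-path, parallel load; the table names and degree of parallelism are illustrative, and a backup afterwards is advisable since minimal redo is generated:

ALTER SESSION ENABLE PARALLEL DML;
ALTER TABLE target_table NOLOGGING;

INSERT /*+ APPEND PARALLEL(target_table, 4) */ INTO target_table
SELECT * FROM source_table;

COMMIT;

ALTER TABLE target_table LOGGING;   -- restore logging once the load is done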
Maybe one of your best options is actually to avoid Oracle's own utilities as much as possible.
I've been baffled by this myself, but very often a Java process can outperform many of Oracle's utilities, which either use OCI (read: SQL*Plus) or take up so much of your time to get right (read: SQL*Loader).
This doesn't prevent you from using specific hints either (like /*+ APPEND */).
I've been pleasantly surprised each time I've turned to that kind of solution.
Cheers,
Rollo