Disk reads not reduced when running a query again - Oracle

Case 1:
I am running a SQL query in Oracle; it is a simple SELECT statement on a table with no index. The statistics I got for the query show a TABLE ACCESS FULL, 176k buffer gets and 111k disk reads. I ran the same query again and checked the statistics: only the time was reduced, with no change in buffer gets or disk reads. If the data is cached and that is why the time goes down, why aren't buffer gets and disk reads reduced as well?
Case 2:
Now I have created an index on the table and ran the same query; the statistics showed TABLE ACCESS BY INDEX with only a few buffer gets and disk reads. When I ran the same query again I got the same result, with zero disk reads and a reduction in time.
Why are the disk reads not reduced in case 1? When I run a query, what exactly gets cached?
As far as I have noticed, disk reads remain the same for full table scans and joins.
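For context, per-execution statistics such as buffer gets and physical reads can be captured in SQL*Plus with AUTOTRACE; a minimal sketch, where the table name is purely hypothetical:
-- Sketch: show execution statistics (buffer gets, physical reads) for each run
SET AUTOTRACE TRACEONLY STATISTICS
SELECT * FROM my_big_table;
SET AUTOTRACE OFF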

A full scan on a large table generally does not pollute the database buffer cache; that is why in your case 1 the number of disk reads stays the same. In your case 2 you fetch only the data that is needed by using the correct index, and that data gets cached, as shown by your second run, which does not do any disk reads at all.

How large is the table? Simplified, the Oracle buffer cache is a huge hash table, where each bucket heads a linked list of block buffers.
When Oracle accesses a block, it first takes the block's physical address (file number and block id) and computes a hash value from it. Then it traverses the linked list of buffers bound to that particular hash value.
The buffers are ordered using an LRU (with touch count) algorithm; the newest blocks are at the beginning of the list.
And now comes the answer: when Oracle sees that the segment (table) size is bigger than about 5% of the whole buffer cache, it puts the blocks at the end of the LRU list. This prevents the whole buffer cache from being "starved out" by a single query execution. The database is a concurrent environment, and if a single query execution invalidated all the previously cached data, it would be bad for other users.
PS: I'm not sure about the 5% threshold, the value may vary between Oracle versions.
PS1: You are probably not using ASM storage. Oracle might (or might not) use so-called DIRECT file access. When it is enabled, the database tells the OS kernel that disk operations should not be cached by the OS's buffer cache. When direct I/O is disabled (the default option for file storage), your disk data might also be cached at the OS level, but Oracle cannot see it.
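If you want to verify how much of a segment actually sits in the buffer cache between runs, a minimal sketch (assumes access to V$BH and DBA_OBJECTS; the table name is hypothetical):
-- Sketch: count how many blocks of a given table are currently in the buffer cache
SELECT o.object_name, COUNT(*) AS cached_blocks
FROM   v$bh b
JOIN   dba_objects o ON o.data_object_id = b.objd
WHERE  o.object_name = 'MY_BIG_TABLE'
GROUP  BY o.object_name;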

Related

How does Oracle decide which blocks are absent in buffer cache for a query?

Assume that we executed the following query.
select * from employees where salary > 10000;
After some time, we executed the following query.
select * from employees where salary > 500;
The second one tends to return more blocks, but we already have some of those blocks in the buffer cache because of the previous query. Maybe some of them have been removed from the buffer cache, but some or all of the blocks from the first query may still be there. So the database server should know which blocks already exist in the cache and which ones to read from disk additionally.
My question is: how does the database find and decide which blocks to read from disk additionally?
Oracle uses the LRU technique (which stands for 'least recently used'), a computer algorithm used to manage data in a cache. When a cache becomes full and you need space for new things, you discard the least recently used items first (things you haven't used for a while but that are sitting in the cache consuming space).
It is not specific to data blocks - and data blocks are not really kept in an LRU list, they are managed by a touch count these days - but that touch count algorithm is very much like an LRU so you can think of it that way.
In short, when you hear LRU, think of a cache that manages some data (any data), and tends to discard items from the cache based on whether they have been used recently or not. The more recently something has been used - the more likely it is to stay in the cache.
Each block has a DBA - data block address - that consists of a file# and block#. This uniquely identifies a block in a database. Oracle uses that "key" to identify the block in the buffer cache.
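As an illustration, the two components of a block's DBA can be pulled out of any row's ROWID; a small sketch against the EMPLOYEES table from the question:
-- Sketch: show the file# / block# (the data block address) behind a few rows
SELECT dbms_rowid.rowid_relative_fno(rowid) AS file#,
       dbms_rowid.rowid_block_number(rowid) AS block#
FROM   employees
WHERE  rownum <= 5;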
If you run a query and some blocks are not in the cache, it is because the LRU has aged them out to make room for things that have been used more recently. There is no guarantee a block stays cached, but if you need that kind of guarantee you can use different pools in the buffer cache; mainly, you can use the KEEP pool to keep frequently accessed segments in the buffer cache.
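For example, assuming the DBA has sized the KEEP pool (DB_KEEP_CACHE_SIZE), a segment can be assigned to it like this:
-- Sketch: keep a frequently accessed segment in the KEEP buffer pool
ALTER TABLE employees STORAGE (BUFFER_POOL KEEP);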
Hope it clarifies.

How Buffer cache works in oracle database

My question is: in an Oracle database, if there is a 5 GB table and the SGA size is 10 GB, then when I select from the 5 GB table it fits into the 10 GB SGA.
But if my table is more than 10 GB and my SGA size is 5 GB, how do select queries work? Do they still return all the rows of the 10 GB table, and how does the buffer cache work in that case?
If we have a table which is larger than the SGA, one option is to resize the SGA to the required size; the sizes of the SGA and PGA can be defined while creating the database. Note, though, that the query still works even if the table does not fit: Oracle simply reads blocks into the buffer cache and ages them out again as it goes, so all rows are returned either way.
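For instance, with automatic SGA management and an spfile, the SGA target could be resized roughly like this (the value is illustrative only, and a restart is needed if it exceeds SGA_MAX_SIZE):
-- Sketch: resize the SGA target; 12G is purely an example value
ALTER SYSTEM SET sga_target = 12G SCOPE = SPFILE;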
Buffer cache working:
When we request data from disk, Oracle reads at minimum one block. Even if we request only one row, many rows of the same table are likely to be retrieved, since they lie in the same block; the same goes for columns.
A block in the buffer cache can be in one of three states:
Free: Currently not used.
Pinned: Currently being accessed.
Dirty: Block has been modified but not yet written to disk.
A write of dirty blocks to disk is triggered when one of the following happens:
The database is issued a shutdown command.
A full or partial checkpoint occurs.
The recovery time threshold, which is again set by us, is reached.
A free buffer is needed and none are found after a given amount of searching (the LRU algorithm is used here).
Certain Data Definition Language (DDL) commands.
Every three seconds. There are many other reasons; the algorithm is complex and can change with each release of Oracle.
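To get a feel for this on a running instance, the buffer states can be summarized from V$BH (note that V$BH exposes more detailed statuses such as xcur and cr rather than exactly the three labels above); a minimal sketch:
-- Sketch: distribution of buffer cache blocks by status
SELECT status, COUNT(*) AS buffers
FROM   v$bh
GROUP  BY status;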

Sequence cache and performance

I have seen the DBA team advise setting the sequence cache to a higher value at the time of performance optimization, e.g. increasing the value from 20 to 1000 or 5000. The Oracle docs say of the cache value:
Specify how many values of the sequence the database preallocates and keeps in memory for faster access.
Somewhere in the AWR report I can see,
select SEQ_MY_SEQU_EMP_ID.nextval from dual
Can any performance improvement be seen if I increase the cache value of SEQ_MY_SEQU_EMP_ID?
My question is:
Does the sequence cache play any significant role in performance? If so, how do I know what cache value is sufficient for a sequence?
Sequence values are served from the Oracle cache until the cached batch is used up. When all of them have been used, Oracle allocates a new batch of values and updates the Oracle data dictionary.
If you have 100,000 records to insert and the cache size is 20, Oracle will update the data dictionary 5,000 times, but only 20 times if you set 5,000 as the cache size.
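A minimal sketch of raising the cache on the sequence from the AWR report (the value 5000 is only an example):
-- Sketch: increase the number of preallocated sequence values
ALTER SEQUENCE seq_my_sequ_emp_id CACHE 5000;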
More information that may help you: http://support.esri.com/en/knowledgebase/techarticles/detail/20498
If you omit both CACHE and NOCACHE, then the database caches 20 sequence numbers by default. Oracle recommends using the CACHE setting to enhance performance if you are using sequences in an Oracle Real Application Clusters environment.
Using the CACHE and NOORDER options together results in the best performance for a sequence. If the CACHE option is used without the ORDER option, each instance caches a separate range of numbers and sequence numbers may be assigned out of order by the different instances. So the higher the CACHE value, the fewer writes into the dictionary, but the more sequence numbers that might be lost. There is no point in worrying about losing the numbers, though, since a rollback or shutdown will definitely "lose" a number anyway.
The CACHE option causes each instance to cache its own range of numbers, thus reducing I/O to the Oracle data dictionary, and the NOORDER option eliminates message traffic over the interconnect to coordinate the sequential allocation of numbers across all instances of the database. NOCACHE will be SLOW...
Read this
By default an Oracle sequence caches 20 values. We can change this with the CACHE clause in the sequence definition. The CACHE clause helps when we want to generate a large number of values quickly, as it then takes less time than normal; otherwise there is no drastic performance increase simply from declaring a cache clause in the sequence definition.
Have done some research and found some relevant information in this regard:
We need to check the database for sequences which are high-usage but defined with the default cache size of 20 - the performance benefits of altering the cache size of such a sequence can be noticeable.
Increasing the cache size of a sequence does not waste space; the cache is still defined by just two numbers, the last used and the high water mark. It is just that the high water mark is jumped by a much larger value every time it is reached.
A cached sequence will return values exactly the same as a non-cached one. However, a sequence cache is kept in the shared pool just as other cached information is. This means it can age out of the shared pool in the same way as a procedure if it is not accessed frequently enough. Everything in the cache is also lost when the instance is shut down.
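A minimal sketch of finding sequences that are still on the default cache size (assumes access to DBA_SEQUENCES):
-- Sketch: list sequences still using the default cache of 20 (or less)
SELECT sequence_owner, sequence_name, cache_size
FROM   dba_sequences
WHERE  cache_size <= 20
ORDER  BY sequence_owner, sequence_name;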
Besides spending more time updating the Oracle data dictionary, small sequence caches can have other negative effects if you work with a clustered Oracle installation.
In Oracle 10g RAC Grid, Services and Clustering 1st Edition by Murali Vallath it is stated that if you happen to have
an Oracle Cluster (RAC)
a non-partitioned index on a column populated with an increasing sequence value
concurrent multi instance inserts
you can incur high contention on the rightmost index block and experience a lot of cluster waits (up to 90% of total insert time).
If you increase the size of the relevant sequence cache you can reduce the impact of Cluster Waits on your index.
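A sketch of a sequence definition tuned along those lines (the name and cache size are illustrative only):
-- Sketch: a large cache plus NOORDER to minimize dictionary updates
-- and interconnect coordination on RAC
CREATE SEQUENCE seq_order_id CACHE 5000 NOORDER;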

Are random updates disk bound mostly in standard and append only databases?

If I have a large dataset and do random updates, then I think the updates are mostly disk bound (for append-only databases it is not about seeks but about bandwidth, I think). When I update a record slightly, one data page must be updated, so if my disk can write 10 MB/s of data and the page size is 16 KB, then I can have at most 640 random updates per second. In append-only databases it is about 320 per second, because one update can touch two pages - index and data. In other databases, because of random seeks to update the page in place, it can be even worse, like 100 updates per second.
I assume that one page in the cache receives only one update before it is written out (random updates). Going forward, the same applies to random inserts spread across all data pages (for example non-time-ordered UUIDs), or even worse.
I am referring to the situation when dirty pages (after an update) must be flushed to disk and synced (they can no longer stay in the cache). Is the updates-per-second count in this situation bounded by disk bandwidth? Are my calculations, like 320 updates per second, likely? Maybe I am missing something?
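Making the back-of-envelope arithmetic in the question explicit (all figures are the question's own assumptions):
\[ \frac{10\ \text{MB/s}}{2\ \text{pages} \times 16\ \text{KB/page}} = \frac{10240\ \text{KB/s}}{32\ \text{KB}} \approx 320\ \text{updates per second} \]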
"It depends."
To be complete, there are other things to consider.
First, the only thing distinguishing a random update from an append is the head seek involved. A random update will have the head dancing all over the platter, whereas an append will ideally just track like a record player. This also assumes that each disk write is a full write and completely independent of all other writes.
Of course, that's in a perfect world.
With most modern databases, each update will typically involve, at a minimum, 2 writes. One for the actual data, the other for the log.
In a typical scenario, if you update a row, the database will make the change in memory. If you commit that row, the database will acknowledge that by making a note in the log, while keeping the actual dirty page in memory. Later, when the database checkpoints, it will write the dirty pages to disk. But when it does this, it will sort the blocks and write them as sequentially as it can. Then it will write a checkpoint to the log.
During recovery, when the DB crashed and could not checkpoint, the database reads the log up to the last checkpoint, "rolls it forward" and applies those changes to the actual disk pages, marks the final checkpoint, then makes the system available for service.
The log write is sequential, the data writes are mostly sequential.
Now, if the log is part of a normal file (typical today), then you write the log record, which appends to the disk file. The file system will then (likely) append to its own log the change you just made, so that it can update its local file system structures. Later, the file system will also commit its dirty pages and make its metadata changes permanent.
So, you can see that even a simple append can invoke multiple writes to the disk.
Now consider an "append only" design like CouchDB. When you make a simple write, Couch does not have a separate log; the file is its own log. CouchDB files grow without end and need compaction during maintenance. But when it does the write, it writes not just the data page but also any indexes affected, and when indexes are affected, Couch will rewrite the entire branch of the index from root to leaf. So a simple write in this case can be more expensive than you would first think.
Now, of course, you throw in all of the random reads to disrupt your random writes and it all gets quite complicated quite quickly. What I've learned, though, is that while streaming bandwidth is an important aspect of IO operations, overall operations per second are even more important. You can have 2 disks with the same bandwidth, but the one with the slower platter and/or head speed will have fewer ops/sec, just from head travel time and platter seek time.
Ideally, your DB uses dedicated raw storage rather than a file system for storage, but most do not do that today; the operational advantages of file-system-based stores typically outweigh the performance benefits.
If you're on a file system, then preallocated, sequential files are a benefit, so that your "append only" isn't simply skipping around other files on the file system, thus becoming similar to random updates. Also, by using preallocated files, your updates are simply updating DB data structures during writes, rather than DB and file system data structures as the file expands.
Putting logs, indexes, and data on separate disks allows multiple drives to work simultaneously with less interference. Your log can truly be append only, for example, instead of fighting with the random data reads or index updates.
So, all of those things factor in to throughput on DBs.

Optimizing massive insert performance...?

Given: SQL Server 2008 R2. Quite some speedy data discs. Log discs lagging.
Required: LOTS, LOTS, LOTS of inserts. Like 10,000 to 30,000 rows per second into a simple table with two indices. Inserts have an intrinsic order and will not repeat; as such, the order of inserts need not be maintained in the short term (i.e. multiple parallel inserts are ok).
So far: accumulating data into a queue. Regularly (async thread pool) emptying up to 1024 entries into a work item that gets queued. The thread pool (custom class) has 32 possible threads and opens 32 connections.
Problem: performance is off by a factor of 300... only about 100 to 150 rows are inserted per second. Log wait time is up to 40%-45% of processing time (ms per second) in SQL Server. Server CPU load is low (4% to 5% or so).
Not usable: bulk insert. The data must be written to disc as close to real time as possible. This is pretty much an archival process of data running through the system, but there are queries which need access to the data regularly. I could try dumping the rows to disc and using bulk upload 1-2 times per second... will give this a try.
Anyone have a smart idea? My next step is moving the log to a fast disc set (128 GB modern SSD) to see what happens then. A significant performance boost probably will change things quite a bit. But even then... the question is whether / what is feasible.
So, please fire away with the smart ideas.
OK, answering myself: going to give SqlBulkCopy a try, batching up to 65536 entries and flushing them out every second in an async fashion. Will report on the gains.
I'm going through the exact same issue here, so I'll go through the steps I'm taking to improve my performance:
Separate the log and the dbf file onto different spindle sets.
Use the simple recovery model.
You didn't mention any indexing requirements other than the fact that the order of inserts isn't important - in this case clustered indexes on anything other than an identity column shouldn't be used.
Start your scaling of concurrency again from 1 and stop when your performance flattens out; anything over this will likely hurt performance.
Rather than dropping to disk to bcp, and as you are using SQL Server 2008, consider inserting multiple rows at a time; this statement inserts three rows in a single SQL call:
INSERT INTO my_table VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9)
I was topping out at ~500 distinct inserts per second from a single thread. After ruling out the network and CPU (0 on both client and server), I assumed that disk I/O on the server was to blame; however, inserting in batches of three got me 1500 inserts per second, which rules out disk I/O.
It's clear that the MS client library has an upper limit baked into it (and a dive into reflector shows some hairy async completion code).
Batching in this way, waiting for x events to be received before calling insert, has me now inserting at ~2700 inserts per second from a single thread which appears to be the upper limit for my configuration.
Note: if you don't have a constant stream of events arriving at all times, you might consider adding a timer that flushes your inserts after a certain period (so that you see the last event of the day!)
Some suggestions for increasing insert performance:
Increase ADO.NET BatchSize
Choose the target table's clustered index wisely, so that inserts won't lead to clustered index node splits (e.g. autoinc column)
Insert into a temporary heap table first, then issue one big "insert-by-select" statement to push all that staging table data into the actual target table (see the sketch after this list)
Apply SqlBulkCopy
Choose the "Bulk Logged" recovery model instead of the "Full" recovery model
Place a table lock before inserting (if your business scenario allows for it)
Taken from Tips For Lightning-Fast Insert Performance On SqlServer
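A minimal T-SQL sketch of the staging-table approach from the list above (table and column names are purely illustrative; the TABLOCK hint corresponds to the table-lock suggestion):
-- Sketch: stage rows in a heap table, then move them in one set-based insert
INSERT INTO dbo.EventsStaging (EventId, Payload)
VALUES (1, 'a'), (2, 'b'), (3, 'c');

INSERT INTO dbo.Events WITH (TABLOCK) (EventId, Payload)
SELECT EventId, Payload
FROM   dbo.EventsStaging;

TRUNCATE TABLE dbo.EventsStaging;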
