How the buffer cache works in an Oracle database

My question is: in an Oracle database, if there is a 5 GB table and the SGA size is 10 GB, then when I select from the 5 GB table, it will fit into the 10 GB SGA.
But if my table is larger than 10 GB and my SGA size is 5 GB, how do select queries work? Do they display all the rows of the 10 GB table, and how does the buffer cache work?

If we have a table that is larger than the SGA, then we will have to resize the SGA to the required size, and then the problem is solved.
We can define the size of the SGA and the PGA while creating the database.
How the buffer cache works:
When we request data from disk, Oracle reads at minimum one block. Even if we request only one row, many rows in the same table are likely to be retrieved, since they lie in the same block. The same goes for columns.
A block in the buffer cache can be in one of three states:
Free: currently not used.
Pinned: currently being accessed.
Dirty: the block has been modified but not yet written to disk.
A block write is triggered when one of the following happens:
The database is issued a shutdown command.
A full or partial checkpoint occurs.
A recovery time threshold, which is again set by us, is reached.
A free block is needed and none are found after a given amount of time (an LRU algorithm is used here).
Certain Data Definition Language (DDL) commands are issued.
Every three seconds. There are many other triggers; the algorithm is complex and can change with each release of Oracle.
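If you want to observe these states yourself, you can query V$BH, which exposes one row per buffer in the cache. This is a minimal sketch, assuming you have SELECT privileges on the V$ views and using the hypothetical table name EMPLOYEES; note that the STATUS column uses Oracle's internal names (free, xcur, cr, and so on) rather than the simplified labels above.
-- one row per buffer; OBJD matches DBA_OBJECTS.DATA_OBJECT_ID
select bh.status, count(*) as buffers
from v$bh bh
join dba_objects o on o.data_object_id = bh.objd
where o.object_name = 'EMPLOYEES'   -- hypothetical table name
group by bh.status;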

How does Oracle decide which blocks are absent in buffer cache for a query?

Assume that we executed the following query.
select * from employees where salary > 10000;
After some time, we executed the following query.
select * from employees where salary > 500;
The second one tends to return more blocks. But we already have some of these blocks in the buffer cache because of the previous query. Maybe some of them have been evicted from the buffer cache, but some or all of the blocks from the first query may still be there. So the database server needs to know which blocks already exist in the cache and which ones must additionally be read from disk.
My question is: how does the database find and decide which blocks to read from disk additionally?
Oracle uses the LRU technique (which stands for 'least recently used'). It is a computer algorithm used to manage data in a cache. When a cache becomes full and you need space for new things, you discard the least recently used items first (things you haven't used for a while but that are sitting in the cache consuming space).
It is not specific to data blocks - and data blocks are not really kept in an LRU list, they are managed by a touch count these days - but that touch count algorithm is very much like an LRU, so you can think of it that way.
In short, when you hear LRU, think of a cache that manages some data (any data), and tends to discard items from the cache based on whether they have been used recently or not. The more recently something has been used - the more likely it is to stay in the cache.
Each block has a DBA - data block address - that consists of a file# and block#. This uniquely identifies a block in a database. Oracle uses that "key" to identify the block in the buffer cache.
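You can see the file# and block# that make up a row's data block address by decoding its ROWID with the DBMS_ROWID package. A minimal sketch against the EMPLOYEES table from the question:
select dbms_rowid.rowid_relative_fno(rowid) as file_no,   -- file# part of the DBA
       dbms_rowid.rowid_block_number(rowid) as block_no   -- block# part of the DBA
from employees
where salary > 10000;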
If you run a query and some blocks are not in the cache, it is because the LRU has discarded them to make room for blocks used more recently. There is no guarantee a block stays cached, but if you need that kind of guarantee, you can use the different pools in the buffer cache; mainly, you can use the KEEP pool to hold frequently accessed segments in memory.
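Assigning a segment to the KEEP pool is a one-line DDL change; a sketch, assuming the DBA has already sized DB_KEEP_CACHE_SIZE and again using the EMPLOYEES table:
alter table employees storage (buffer_pool keep);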
Hope it clarifies.

Why is my H2 database 7x larger on disk than it should be?

I have an H2 database that has ballooned to several gigabytes in size, causing all sorts of operational problems. The database size didn't seem right, so I took one little slice of it, just one table, to try to figure out what's going on.
I brought this table into a test environment:
The columns add up to 80 bytes per row, per my calculations.
The table has 280,000 rows.
For this test, all indexes were removed.
The table should occupy approximately
80 bytes per row * 280,000 rows = 22.4 MB on disk.
However, it is physically taking up 157 MB.
I would expect to see some overhead here and there, but why is this database a full 7x larger than can be reasonably estimated?
UPDATE
Output from CALL DISK_SPACE_USED
There are always indexes, etc. to be taken into account.
Can you try:
CALL DISK_SPACE_USED('my_table');
I would also recommend running SHUTDOWN DEFRAG and calculating the size again.
Setting MV_STORE=FALSE on database creation solves the problem. The whole database (not just the test slice from the example) is now approximately 10x smaller.
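For reference, the setting goes into the connection URL when the database file is first created; a sketch, with a hypothetical database file ~/mydb (this applies to H2 1.4.x, which still ships the legacy PageStore format):
jdbc:h2:~/mydb;MV_STORE=FALSE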
Update
I had to revisit this topic recently and ran a comparison against MySQL. On my test dataset, with MV_STORE=FALSE, the H2 database takes up 360 MB of disk space, while the same data on MySQL 5.7 InnoDB with default-ish configuration takes up 432 MB. YMMV.

Disk reads not reduced while running a query again

Case 1:
I am running a SQL query in Oracle; it's a simple select statement on a table with no index. The stats information that I got for the query shows a TABLE ACCESS FULL, 176k buffer_gets and 111k disk_reads. I ran the same query again and checked the stats result: only the time was reduced, and there was no change in buffer gets and disk reads. As the data is cached, the time is reduced, so why not the buffer gets and disk reads?
Case 2:
Now I have created an index for the table, ran the same query, and looked at the stats result: I got TABLE ACCESS BY INDEX ROWID and a few buffer gets and disk reads. When I ran the same query again, I got the same result with zero disk reads and a reduction in time.
Why are the disk reads not reduced in case 1? When I run a query, what exactly gets cached?
As far as I have noticed, disk reads remain the same for full table scans and joins.
A full scan on a large table generally does not spoil the database cache. That is why in your case 1 the number of disk reads remains the same. In your case 2, you fetch only the data that is needed by using the correct index, and that will be cached, as proven by your second run, where it does no disk accesses at all.
How large is the table? To simplify, the Oracle buffer cache is a huge hash table, where each cell starts a linked list of buffers.
When Oracle accesses a block, it first takes the block's physical address (file number and block id) and computes a hash value of it. Then it traverses the linked list of blocks bound to this particular hash value.
The blocks are ordered using an LRU (with touch count) algorithm. The newest blocks are at the beginning of the list.
And now comes the answer: when Oracle sees that the segment (table) size is bigger than 5% of the whole buffer cache, it puts the blocks at the end of the LRU list. This prevents the whole buffer cache from being "starved out" by a single query execution. The database is a concurrent environment, and if a single query execution invalidated all the previously cached data, it would be bad for other users.
PS: I'm not sure about the 5% threshold; the value may vary between Oracle versions.
PS1: You are probably not using ASM storage. Oracle might (or might not) use so-called direct file access. When it is enabled, the database tells the OS kernel that disk operations should not be cached by the OS's buffer cache. When direct I/O is disabled (the default option for file storage), your disk data might also be cached at the OS level, but Oracle cannot see it.
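If you want to track these counters yourself, V$SQL keeps cumulative buffer gets and disk reads per cached statement. A minimal sketch, assuming access to the V$ views; the 'my_table' fragment is a placeholder for a distinctive part of your statement text:
select sql_text, executions, buffer_gets, disk_reads
from v$sql
where sql_text like '%my_table%';   -- 'my_table' is a placeholder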

Sequence cache and performance

I have seen the DBA team advise setting the sequence cache to a higher value at the time of performance optimization, e.g. increasing the value from 20 to 1000 or 5000. The Oracle docs say of the cache value:
Specify how many values of the sequence the database preallocates and keeps in memory for faster access.
Somewhere in the AWR report I can see,
select SEQ_MY_SEQU_EMP_ID.nextval from dual
Can any performance improvement be seen if I increase the cache value of SEQ_MY_SEQU_EMP_ID?
My question is:
Does the sequence cache play any significant role in performance? If so, how do I know what cache value is sufficient for a sequence?
Sequence values are served from the Oracle cache until they are used up. When all of them have been used, Oracle allocates a new batch of values and updates the data dictionary.
If you need to insert 100,000 records and the cache size is 20, Oracle will update the data dictionary 5,000 times, but only 20 times if you set the cache size to 5,000.
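The change itself is a one-liner; a sketch, using the sequence name from the AWR report above:
alter sequence seq_my_sequ_emp_id cache 5000;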
More information that may help you: http://support.esri.com/en/knowledgebase/techarticles/detail/20498
If you omit both CACHE and NOCACHE, then the database caches 20 sequence numbers by default. Oracle recommends using the CACHE setting to enhance performance if you are using sequences in an Oracle Real Application Clusters environment.
Using the CACHE and NOORDER options together results in the best performance for a sequence. When the CACHE option is used without the ORDER option, each instance caches a separate range of numbers, and sequence numbers may be assigned out of order by the different instances. So the higher the CACHE value, the fewer writes to the dictionary, but the more sequence numbers that might be lost. There is no point worrying about losing numbers, though, since a rollback or shutdown will definitely "lose" numbers anyway.
The CACHE option causes each instance to cache its own range of numbers, thus reducing I/O to the Oracle data dictionary, and the NOORDER option eliminates message traffic over the interconnect to coordinate the sequential allocation of numbers across all instances of the database. NOCACHE will be SLOW...
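Putting the two options together, a minimal sketch (the sequence name is hypothetical):
create sequence seq_emp_id cache 1000 noorder;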
By default, a sequence cache in Oracle contains 20 values. We can redefine it by giving a CACHE clause in the sequence definition. The CACHE clause mainly pays off when we generate numbers at a high rate, where it takes less time than normal; otherwise there is no drastic performance increase from declaring a CACHE clause in the sequence definition.
Have done some research and found some relevant information in this regard:
We need to check the database for sequences which are high-usage but defined with the default cache size of 20 - the performance benefits of altering the cache size of such a sequence can be noticeable.
Increasing the cache size of a sequence does not waste space; the cache is still defined by just two numbers, the last used and the high water mark. It is just that the high water mark is jumped by a much larger value every time it is reached.
A cached sequence will return values exactly the same as a non-cached one. However, a sequence cache is kept in the shared pool just as other cached information is. This means it can age out of the shared pool in the same way as a procedure if it is not accessed frequently enough. Everything in the cache is also lost when the instance is shut down.
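To find the high-usage sequences still on the default cache size of 20 mentioned in the first point above, a minimal sketch of such a check, assuming SELECT access to DBA_SEQUENCES:
select sequence_owner, sequence_name, cache_size
from dba_sequences
where cache_size <= 20   -- default cache size or smaller
order by sequence_owner, sequence_name;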
Besides spending more time updating the Oracle data dictionary, small sequence caches can have other negative effects if you work with a clustered Oracle installation.
In Oracle 10g RAC Grid, Services and Clustering, 1st Edition, by Murali Vallath, it is stated that if you happen to have
an Oracle Cluster (RAC)
a non-partitioned index on a column populated with an increasing sequence value
concurrent multi instance inserts
you can incur high contention on the rightmost index block and experience a lot of Cluster Waits (up to 90% of total insert time).
If you increase the size of the relevant sequence cache you can reduce the impact of Cluster Waits on your index.

Inserting data into Berkeley DB: very poor performance with big data

I already have an 80 GB Berkeley DB file. I measured the average insert speed at 8 ms for one record (32-byte key / 100-byte value) without transactions.
Compare that to inserting into an empty database with the same interface, where the average speed is 3~6 µs.
If you insert data when your buffers are empty, it can perform very well; once the buffers of your system are full, inserts cannot continue until some buffer space has been cleared, e.g. the time to write data to an HDD is typically 8 ms.
I would test bursts of, say, one million records, after the system is quiet, to see what the latency is like when the buffers are not full.
