What is the max size that I can provide with the cache clause in Oracle sequences? - oracle

CREATE SEQUENCE S1
START WITH 100
INCREMENT BY 10
CACHE 10000000000000000000000000000000000000000000000000000000000000000000000000
If I fire a statement with such a big size, will it even create the sequence S1?
What is the max size that I can provide for CACHE?

http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/statements_6015.htm#SQLRF01314
Quote from 11g docs ...
Specify how many values of the sequence the database preallocates and keeps in memory for faster access. This integer value can have 28 or fewer digits. The minimum value for this parameter is 2. For sequences that cycle, this value must be less than the number of values in the cycle. You cannot cache more values than will fit in a given cycle of sequence numbers. Therefore, the maximum value allowed for CACHE must be less than the value determined by the following formula:
(CEIL (MAXVALUE - MINVALUE)) / ABS (INCREMENT)
If a system failure occurs, then all cached sequence values that have not been used in committed DML statements are lost. The potential number of lost values is equal to the value of the CACHE parameter.
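As a quick illustration of that formula (my own numbers, not taken from the docs): a cycling sequence with MINVALUE 1, MAXVALUE 1000 and INCREMENT BY 1 gives (1000 - 1) / 1 = 999, so the CACHE value has to stay below that:
CREATE SEQUENCE s_cycle
MINVALUE 1
MAXVALUE 1000
INCREMENT BY 1
CYCLE
CACHE 500;  -- well under the limit; trying to cache a whole cycle raises ORA-04013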
Determining the optimal value is a matter of determining the rate at which you will generate new values, and thus the frequency with which recursive SQL will have to be executed to update the sequence record in the data dictionary. Typically it's higher for RAC systems to avoid contention, but then they are also generally busier as well. Performance problems relating to insufficient sequence cache are generally easy to spot through AWR/Statspack and other diagnostic tools.

Looking in the Oracle API, I don't see a maximum cache size specified (Reference).
Here are some guidelines on setting an optimal cache size.

Related

Retrieving sequential numbers in a distributed system

The system consists of tens of peer servers (none of them is a leader/master).
To create an entity, the service should acquire the next sequential number based on some group key: there is a different sequence for each group key.
Let's say that to create an instance of entity A, the service has to get a sequence number with group key A, while to create entity B, it has to get a sequence number with group key B.
Getting the same number twice is prohibited. Missing numbers are allowed.
Currently I have implemented a solution with an RDBMS, having a record for each of the group keys and updating its current sequence value in a transaction:
UPDATE SEQUENCES SET SEQ_ID=SEQ_ID + 1 WHERE KEY = ?
However, this approach only allows 200-300 queries per second because of locking and synchronisation.
Another approach I am considering is to have a local buffer of sequences at each node. Once the buffer is empty, the service queries the DB to get the next batch of IDs and stores them locally: UPDATE SEQUENCES SET SEQ_ID=SEQ_ID + 1000 WHERE KEY = ? if the batch size is 1000, as sketched below. This may help to lower contention. However, if a node goes down it loses all of its acquired sequence numbers, which, if it happens frequently, can lead to overflowing the maximum value of the sequence (e.g. max int).
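A sketch of that batch claim in Oracle syntax (table and column names from the question; the bind names are mine; RETURNING ... INTO is standard in PL/SQL):
UPDATE SEQUENCES SET SEQ_ID = SEQ_ID + 1000
WHERE KEY = :group_key
RETURNING SEQ_ID INTO :range_end;
-- this node now owns (:range_end - 999) .. :range_end and can hand them out locally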
I don't know in advance how many sequence numbers will be needed.
I don't want to introduce additional dependencies between servers and have one of them generate sequence numbers and serve them to the others.
What are the general ways to solve similar problems?
Which other RDBMS-based approaches can be considered?
Which other NOT RDBMS-based approaches can be considered?
What other problems can happen with local buffer solution?

When sequence was updated in oracle

I've found a gap of 18 numbers in my Oracle sequence that appeared during the holidays, and found no traces of this crime in the log files. Can I somehow see the time when a sequence is updated, or is there any other way to trace the disappearance of sequence numbers?
As the other answers/comments state, sequences are never guaranteed to be gap free. Gaps can happen in many ways.
One of the things about sequences is that they default to CACHE 20. That means that when NEXTVAL is called, the sequence in the database is actually increased by 20, the value is returned, and the next 19 values are just returned from memory (the shared pool).
If the sequence is aged out of the shared pool, or the shared pool is flushed, or the database is restarted, the values "left over" in the sequence memory cache are lost, as the sequence next time picks up the value from the database, which is 20 higher than last time.
As you state it happened during the holidays, I'd guess the likely cause is that your sequence used 2 values from memory, then the DBA did maintenance during the holidays that caused the shared pool to flush, and the remaining 18 values in the sequence memory cache are gone.
That is normal behaviour and the reason why sequences are fast. You can get "closer" to gap-free by using NOCACHE NOORDER, but it will cost performance and still never be quite gap-free anyway.
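A sketch of that behaviour you can try yourself (hypothetical sequence; the flush stands in for whatever maintenance aged the cache out):
CREATE SEQUENCE s_demo CACHE 20;
SELECT s_demo.NEXTVAL FROM dual;  -- 1; the data dictionary already records 21
SELECT s_demo.NEXTVAL FROM dual;  -- 2, served from the shared pool
ALTER SYSTEM FLUSH SHARED_POOL;   -- typically discards the cached values 3..20
SELECT s_demo.NEXTVAL FROM dual;  -- 21: a gap of 18, just like in the question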
A sequence is not guaranteed to produce a gap free sequence of numbers. It is used to produce a series of (unique) numbers that is highly scalable in a multi-user environment.
If your application requires that some numeric identifier on a table is sequential without gaps then you will need another way of generating those identifiers.
Oracle sequences are NOORDER by default; please refer to the Oracle documentation: https://docs.oracle.com/cd/B13789_01/server.101/b10759/statements_6014.htm. With NOORDER, values are not guaranteed to be issued in the order they were requested. You can check whether your sequence uses ORDER; if it does, then the gap has some other cause.
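You can check both the cache size and the ORDER flag from the data dictionary (standard view; the sequence name is a placeholder):
SELECT sequence_name, cache_size, order_flag
FROM user_sequences
WHERE sequence_name = 'MY_SEQ';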

Sequence cache and performance

I have seen the DBA team advise setting the sequence cache to a higher value at the time of performance optimization, e.g. increasing the value from 20 to 1000 or 5000. The Oracle docs say this about the cache value:
Specify how many values of the sequence the database preallocates and keeps in memory for faster access.
Somewhere in the AWR report I can see:
select SEQ_MY_SEQU_EMP_ID.nextval from dual
Can any performance improvement be seen if I increase the cache value of SEQ_MY_SEQU_EMP_ID?
My question is:
Does the sequence cache play any significant role in performance? If so, how do I know what cache value is sufficient for a sequence?
Sequence values are served from the Oracle cache until they are used up; only then does Oracle allocate a new batch of values and update the data dictionary.
If you need to insert 100,000 records and the cache size is 20, Oracle will update the data dictionary 5,000 times, but only 20 times if you set the cache size to 5,000.
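Raising the cache is a one-line change (sequence name taken from the question above):
ALTER SEQUENCE seq_my_sequ_emp_id CACHE 5000;  -- ~20 dictionary updates per 100,000 inserts instead of 5,000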
More information that may help you: http://support.esri.com/en/knowledgebase/techarticles/detail/20498
If you omit both CACHE and NOCACHE, then the database caches 20 sequence numbers by default. Oracle recommends using the CACHE setting to enhance performance if you are using sequences in an Oracle Real Application Clusters environment.
Using the CACHE and NOORDER options together results in the best performance for a sequence. When CACHE is used without the ORDER option, each instance caches a separate range of numbers, and sequence numbers may be assigned out of order by the different instances. So the higher the value of CACHE, the fewer writes into the dictionary, but the more sequence numbers that might be lost. But there is no point in worrying about losing numbers, since a rollback or shutdown will definitely "lose" a number anyway.
The CACHE option causes each instance to cache its own range of numbers, thus reducing I/O to the Oracle data dictionary, and the NOORDER option eliminates message traffic over the interconnect to coordinate the sequential allocation of numbers across all instances of the database. NOCACHE will be SLOW...
By default, the cache for an Oracle sequence holds 20 values. We can redefine this with the CACHE clause in the sequence definition. The CACHE clause helps when generating large numbers of values, where it takes less time than the default; otherwise there is no drastic performance gain from declaring a bigger cache in the sequence definition.
I have done some research and found some relevant information in this regard:
We need to check the database for sequences which are high-usage but defined with the default cache size of 20 - the performance benefits of altering the cache size of such a sequence can be noticeable.
Increasing the cache size of a sequence does not waste space; the cache is still defined by just two numbers, the last used and the high-water mark. It is just that the high-water mark is jumped by a much larger value every time it is reached.
A cached sequence will return values exactly the same as a non-cached one. However, a sequence cache is kept in the shared pool just as other cached information is. This means it can age out of the shared pool in the same way as a procedure if it is not accessed frequently enough. Everything in the cache is also lost when the instance is shut down.
Besides spending more time updating the Oracle data dictionary, small sequence caches can have other negative effects if you work with a clustered Oracle installation.
In Oracle 10g RAC Grid, Services and Clustering (1st Edition) by Murali Vallath it is stated that if you happen to have
an Oracle Cluster (RAC),
a non-partitioned index on a column populated with an increasing sequence value, and
concurrent multi-instance inserts,
you can incur high contention on the rightmost index block and experience a lot of Cluster Waits (up to 90% of total insert time).
If you increase the size of the relevant sequence cache you can reduce the impact of Cluster Waits on your index.

Understanding the results of Execute Explain Plan in Oracle SQL Developer

I'm trying to optimize a query but don't quite understand some of the information returned from Explain Plan. Can anyone tell me the significance of the OPTIONS and COST columns? In the OPTIONS column, I only see the word FULL. In the COST column, I can deduce that a lower cost means a faster query. But what exactly does the cost value represent and what is an acceptable threshold?
The output of EXPLAIN PLAN is a debug output from Oracle's query optimiser. The COST is the final output of the Cost-based optimiser (CBO), the purpose of which is to select which of the many different possible plans should be used to run the query. The CBO calculates a relative Cost for each plan, then picks the plan with the lowest cost.
(Note: in some cases the CBO does not have enough time to evaluate every possible plan; in these cases it just picks the plan with the lowest cost found so far)
In general, one of the biggest contributors to a slow query is the number of rows read to service the query (blocks, to be more precise), so the cost will be based in part on the number of rows the optimiser estimates will need to be read.
For example, let's say you have the following query:
SELECT emp_id FROM employees WHERE months_of_service = 6;
(The months_of_service column has a NOT NULL constraint on it and an ordinary index on it.)
There are two basic plans the optimiser might choose here:
Plan 1: Read all the rows from the "employees" table, for each, check if the predicate is true (months_of_service=6).
Plan 2: Read the index where months_of_service=6 (this results in a set of ROWIDs), then access the table based on the ROWIDs returned.
Let's imagine the "employees" table has 1,000,000 (1 million) rows. Let's further imagine that the values for months_of_service range from 1 to 12 and are fairly evenly distributed for some reason.
The cost of Plan 1, which involves a FULL SCAN, will be the cost of reading all the rows in the employees table, which is approximately equal to 1,000,000; but since Oracle will often be able to read the blocks using multi-block reads, the actual cost will be lower (depending on how your database is set up) - e.g. let's imagine the multi-block read count is 10 - the calculated cost of the full scan will be 1,000,000 / 10; overall cost = 100,000.
The cost of Plan 2, which involves an INDEX RANGE SCAN and a table lookup by ROWID, will be the cost of scanning the index, plus the cost of accessing the table by ROWID. I won't go into how index range scans are costed but let's imagine the cost of the index range scan is 1 per row; we expect to find a match in 1 out of 12 cases, so the cost of the index scan is 1,000,000 / 12 = 83,333; plus the cost of accessing the table (assume 1 block read per access, we can't use multi-block reads here) = 83,333; Overall cost = 166,666.
As you can see, the cost of Plan 1 (full scan) is LESS than the cost of Plan 2 (index scan + access by rowid) - which means the CBO would choose the FULL scan.
If the assumptions made here by the optimiser are true, then in fact Plan 1 will be preferable and much more efficient than Plan 2 - which disproves the myth that FULL scans are "always bad".
The results would be quite different if the optimiser goal was FIRST_ROWS(n) instead of ALL_ROWS - in which case the optimiser would favour Plan 2 because it will often return the first few rows quicker, at the cost of being less efficient for the entire query.
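To see which plan the CBO actually picks, you can run EXPLAIN PLAN yourself (standard Oracle syntax, reusing the example query above):
EXPLAIN PLAN FOR
SELECT emp_id FROM employees WHERE months_of_service = 6;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- the OPERATION/OPTIONS columns will show either TABLE ACCESS FULL
-- or INDEX RANGE SCAN plus TABLE ACCESS BY INDEX ROWID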
The CBO builds a decision tree, estimating the costs of each possible execution path available per query. The costs are set by the CPU_cost or I/O_cost parameter set on the instance. And the CBO estimates the costs as best it can with the existing statistics of the tables and indexes that the query will use. You should not tune your query based on cost alone. Cost allows you to understand WHY the optimizer is doing what it does. Without cost you couldn't figure out why the optimizer chose the plan it did. Lower cost does not mean a faster query. There are cases where this is true and there will be cases where this is wrong. Cost is based on your table stats, and if they are wrong the cost is going to be wrong.
When tuning your query, you should take a look at the cardinality and the number of rows of each step. Do they make sense? Is the cardinality the optimizer is assuming correct? Are the rows being returned reasonable? If the information present is wrong then it's very likely the optimizer doesn't have the proper information it needs to make the right decision. This could be due to stale or missing statistics on the table and index, as well as CPU stats. It's best to have stats updated when tuning a query to get the most out of the optimizer. Knowing your schema is also of great help when tuning. Knowing when the optimizer chose a really bad plan and pointing it down the correct path with a small hint can save a load of time.
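A quick way to check for stale or missing table statistics (standard dictionary view; the table name reuses the earlier example):
SELECT table_name, num_rows, last_analyzed, stale_stats
FROM user_tab_statistics
WHERE table_name = 'EMPLOYEES';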
Here is a reference for using EXPLAIN PLAN with Oracle: http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm, with specific information about the columns found here: http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm#i18300
Your mention of 'FULL' indicates to me that the query is doing a full-table scan to find your data. This is okay in certain situations; otherwise it is an indicator of poor indexing / query writing.
Generally, with explain plans, you want to ensure your query is utilizing keys, so that Oracle can find the data you're looking for while accessing the least number of rows possible. Ultimately, you can sometimes only get so far with the architecture of your tables. If the costs remain too high, you may have to think about adjusting the layout of your schema to be more performance-oriented.
In recent Oracle versions the COST represents the amount of time that the optimiser expects the query to take, expressed in units of the amount of time required for a single block read.
So if a single block read takes 2ms and the cost is expressed as "250", the query could be expected to take 500ms to complete.
The optimiser calculates the cost based on the estimated number of single block and multiblock reads, and the CPU consumption of the plan. The latter can be very useful in minimising the cost by performing certain operations before others to try and avoid high CPU cost operations.
This raises the question of how the optimiser knows how long operations take. Recent Oracle versions allow the collection of "system statistics", which are definitely not to be confused with statistics on tables or indexes. The system statistics are measurements of the performance of the hardware, most importantly:
How long a single block read takes
How long a multiblock read takes
How large a multiblock read is (often different to the maximum possible due to table extents being smaller than the maximum, and other reasons).
CPU performance
These numbers can vary greatly according to the operating environment of the system, and different sets of statistics can be stored for "daytime OLTP" operations and "nighttime batch reporting" operations, and for "end of month reporting" if you wish.
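System statistics are gathered with the DBMS_STATS package; a minimal sketch of workload-mode capture from SQL*Plus (when you run it, e.g. during daytime OLTP or the nightly batch window, is up to you):
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');
-- ...let a representative workload run...
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');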
Given these sets of statistics, a given query execution plan can be evaluated for cost in different operating environments, which might promote use of full table scans at some times or index scans at others.
The cost is not perfect, but the optimiser gets better at self-monitoring with every release, and can feed back the actual cost in comparison to the estimated cost in order to make better decisions for the future. This also makes it rather more difficult to predict.
Note that the cost is not necessarily wall clock time, as parallel query operations consume a total amount of time across multiple threads.
In older versions of Oracle the cost of CPU operations was ignored, and the relative costs of single and multiblock reads were effectively fixed according to init parameters.
FULL is probably referring to a full table scan, which means that no indexes are in use. This usually indicates that something is wrong, unless the query is supposed to use all the rows in a table.
Cost is a number that signals the sum of the different loads (processor, memory, disk, IO), and high numbers are typically bad. The numbers are added up when moving to the root of the plan, and each branch should be examined to locate the bottlenecks.
You may also want to query v$sql and v$session to get statistics about SQL statements, and these will have detailed metrics for all kinds of resources, timings and executions.
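For example (standard dynamic performance view; the filter is illustrative):
SELECT sql_id, executions, buffer_gets, disk_reads, elapsed_time
FROM v$sql
WHERE sql_text LIKE 'SELECT emp_id FROM employees%';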

What happens to an Oracle sequence after disaster recovery?

Suppose an Oracle instance has to be recovered after a disaster. Do sequences get reset to the initial state, or the last saved state, or are cached values preserved?
Thank you very much. :-)
The sequence values are stored in the SEQ$ data dictionary table (owned by SYS, I think), and a cache is maintained in memory of the next values to be used, with the size of that cache being dependent on the CACHE value for the sequence.
When the cache is exhausted the SEQ$ table is updated to a new value (in a non-consistent manner -- i.e. without the user session's transaction control applying) and the next, say, 100 values (if CACHE=100) are handed out from memory.
Let's suppose that you're using a sequence with a cache size of 20. When you select a certain value from the sequence, say 1400, the SEQ$ table is updated to a value of 1420. Even if you roll back your transaction, SEQ$ still has that value until the next 20 sequence values have been used, at which time SEQ$ gets updated to 1440. If you have then just used value 1423 and an instance crash occurs, then when the system restarts the next value to be read from the sequence will be 1440.
So, yes the integrity of the sequence will be preserved and numbers will not be "reissued". Note that the same applies to a graceful shutdown -- when you restart you will get a new value of 1440 in the above example. Sequences are not guaranteed to be gap free in practice for this reason (also because using a value and then rolling back does not restore that value to the cache).
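You can see the value a restart would resume from, i.e. the dictionary high-water mark described above, without consuming a value (standard view; the sequence name is a placeholder):
SELECT last_number
FROM user_sequences
WHERE sequence_name = 'MY_SEQ';
-- for a cached sequence, LAST_NUMBER is the next value not yet handed to the cache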
Not that I have any experience with this, but I very much assume that a recovery to a consistent system change number state would also return the sequence to the last saved state. Anything else would be fairly useless in terms of recovery.
As for cached values, those are (can be) lost even when the instance shuts down in an orderly manner (*): instances cache a number of sequence values in memory (SGA) instead of going to the database every time. Unused sequence values that the instance has reserved can "disappear", leaving you with gaps in the sequence.
(*) The 8i documentation mentions that this can happen with parallel instances (RAC), in which case the sequence may not even be strictly ascending (but still unique); the 10g docs say that it happens in case of an instance failure.
