I've come across an unusual scenario in Oracle RAC, and I cannot find documentation that clarifies whether what I'm seeing indicates a database configuration error or is just a plausible but rare corner case.
The RAC environment consists of 2 nodes, T1 as Node 1 and T2 as Node 2. In a few situations, V$ARCHIVED_LOG has contained rows like the following:
NAME                                           SEQUENCE#  THREAD#  DEST_ID  FIRST_CHANGE#  NEXT_CHANGE#  CREATOR
+ARCH/.../thread_1_seq_400732.396.1103796165   400732     1        1        910931813204   910932887346  ARCH
+ARCH/.../thread_2_seq_400732.1793.1110311863  400732     2        1        937479187292   937479244032  ARCH
I can see that the two logs have different SCN ranges, which I would expect. What I did not expect was for them to share the same sequence number.
Can someone explain how these two entries can share the same SEQUENCE#?
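One way to read these two rows, assuming the usual RAC arrangement in which every instance writes its own redo thread and numbers that thread's logs independently, is that SEQUENCE# is only unique within a THREAD#. The Python sketch below simply re-keys the two rows above by (THREAD#, SEQUENCE#); the dictionary layout and the `group_by` helper are illustrative and not anything Oracle provides.

```python
from collections import defaultdict

# The two V$ARCHIVED_LOG rows from the question, reduced to the columns that matter.
rows = [
    {"thread": 1, "sequence": 400732, "first_change": 910931813204, "next_change": 910932887346},
    {"thread": 2, "sequence": 400732, "first_change": 937479187292, "next_change": 937479244032},
]

def group_by(rows, *columns):
    """Group rows by the given columns (hypothetical helper for illustration)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in columns)].append(row)
    return dict(groups)

by_sequence = group_by(rows, "sequence")
by_thread_and_sequence = group_by(rows, "thread", "sequence")

# SEQUENCE# alone matches both rows...
print(len(by_sequence[(400732,)]))
# ...but each (THREAD#, SEQUENCE#) pair picks out exactly one log.
print([len(v) for v in by_thread_and_sequence.values()])
```

If that assumption holds for this environment, the matching sequence numbers are a coincidence between two independent counters rather than a sign of misconfiguration.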
I've got an Oracle MV with 100+ million rows. Much of our code queries this view, so I would like to keep the syntax intact and make sure these calls are very fast.
Because of business logic I know that fewer than 1 million rows would be enough to answer 99%+ of all calls. Say these million rows are in partition 1. I wonder:
(1) Does Oracle (probably) work that out by itself and usually return cached results instead of actually scanning the table?
(2) Can I tell Oracle to always check the first partition first and then continue however it likes?
(3) Can I do anything else?
If (1) is the case, then I guess we are as fast as we can be.
Otherwise, how could I make (2) work? Or is there a (3) that I have not thought of yet?
Can you help out?
Greetings,
Peter
Partial answer to (2):
You can tell Oracle which partition you want to select data from:
SELECT *
FROM partitioned_table PARTITION (partition_name) s
WHERE ....;
I guess what I want to do is just not possible. I am posting APC's comment, as it seems closest to my needs:
APC:
Unlikely. Generally, partition pruning (searching just the partitions which Oracle knows have all the required records) only works with queries which use the partition key. Possibly if you have something which absolutely correlates with the date and you gather histogram stats then Oracle may be able to establish the correlation and still look in the one partition. But I can't be sure as I've only ever worked with Partitions whose key is always part of the query (what a narrow life I've led). – APC Aug 21 at 10:24
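APC's point can be pictured with a toy model. This is only a sketch of the idea, not how Oracle implements pruning: the partition names, the date bounds, and the `partitions_to_scan` function are all made up for illustration. With a predicate on the partition key, the bounds alone decide which partitions could hold matching rows; without one, every partition remains a candidate.

```python
from datetime import date

# Hypothetical range partitions: name -> (inclusive lower bound, exclusive upper bound).
PARTITIONS = {
    "p_2013_q1": (date(2013, 1, 1), date(2013, 4, 1)),
    "p_2013_q2": (date(2013, 4, 1), date(2013, 7, 1)),
    "p_2013_q3": (date(2013, 7, 1), date(2013, 10, 1)),
}

def partitions_to_scan(key_from=None, key_to=None):
    """Return the partitions that may contain rows with key in [key_from, key_to].

    With no predicate on the partition key (both arguments None) nothing
    can be ruled out, so every partition is returned.
    """
    if key_from is None and key_to is None:
        return list(PARTITIONS)
    selected = []
    for name, (low, high) in PARTITIONS.items():
        starts_before_end = key_to is None or low <= key_to
        ends_after_start = key_from is None or high > key_from
        if starts_before_end and ends_after_start:
            selected.append(name)
    return selected

# Predicate on the partition key: only one partition qualifies.
print(partitions_to_scan(date(2013, 2, 1), date(2013, 2, 28)))
# Predicate on some other column only: all partitions must be considered.
print(partitions_to_scan())
```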
I have loaded data into Druid from Hive, and I haven't used any HLL columns.
When I run a COUNT(DISTINCT mycol) query in Druid, I do not get exact counts. The counts are close, but they do not match what I have in Hive.
Why would Druid not give an exact count even though I haven't specified anything about HLL? Alternatively, is there a way to get exact distinct counts in Druid?
I found an old post from 2014 about the same issue, https://groups.google.com/forum/#!topic/druid-development/AMSOVGx5PhQ, but I am not sure whether the current version of Druid supports exact distinct counts.
The COUNT(DISTINCT col) aggregation function by default uses a variant of HyperLogLog, a fast approximate distinct counting algorithm. Druid SQL will switch to exact distinct counts if you set "useApproximateCountDistinct" to "false", either through the query context or through broker configuration (see http://druid.io/docs/latest/querying/sql.html).
To get the exact distinct count on the broker side, set druid.sql.planner.useApproximateCountDistinct to false (see http://druid.io/docs/latest/configuration/index.html#broker-node-configs). Also note that exact mode has a limitation: only one distinct count per query is permitted.
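As a rough sketch of the query-context route, the snippet below builds the JSON body for a Druid SQL request with "useApproximateCountDistinct" switched off. The `/druid/v2/sql` path, the broker address, and the table name `mytable` are assumptions here (only `mycol` comes from the question), so check them against your own cluster and Druid version. The request is constructed but not sent.

```python
import json
import urllib.request

# Assumed broker address and SQL endpoint; adjust for your cluster.
BROKER_SQL_URL = "http://localhost:8082/druid/v2/sql"

def build_exact_count_request(sql, url=BROKER_SQL_URL):
    """Build (but do not send) a Druid SQL request that asks for exact distinct counts."""
    body = {
        "query": sql,
        # Per-query override of the broker-wide planner default.
        "context": {"useApproximateCountDistinct": False},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# "mytable" is a placeholder table name.
request = build_exact_count_request("SELECT COUNT(DISTINCT mycol) FROM mytable")
print(request.data.decode("utf-8"))
# To run it against a live broker: urllib.request.urlopen(request).read()
```

Setting the flag per query like this leaves the broker default untouched, which is convenient when only a few reports need exact numbers.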
Hard to tell what is happening without the DDLs and more clues...
I am guessing data got rolled up when indexed by Druid. When you index data with a Granularity other than none, it can get rolled up to the Granularity level.
I had a similar issue, and in my case it was because of data rollup, as Slim mentioned in his answer.
Basically, if your data is more granular than your queryGranularity, it gets rolled up automatically. If you set queryGranularity to none, timestamps are kept as they are and are not truncated.
One more thing I observed in my case: even with the granularity set to none, if the timestamp and all other columns of two different rows were the same, they were automatically merged into 1 row.
This particular behavior was OK for me, as I was also looking for a distinct count, as you are.
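The rollup behaviour described in these answers can be imitated in a few lines. This is a toy model, not Druid's ingestion code: the `rollup` function, the `granularity_ms` parameter, and the sample events are invented for illustration. Timestamps are truncated to the chosen granularity, and rows that then agree on timestamp and every dimension are merged, which lowers the row count but leaves the number of distinct dimension values unchanged.

```python
from collections import Counter

def rollup(rows, granularity_ms):
    """Merge rows whose truncated timestamp and dimension are identical.

    rows: iterable of (timestamp_ms, dimension_value) pairs.
    Returns a Counter mapping (truncated_timestamp, dimension_value)
    to the number of input rows merged into that key.
    """
    merged = Counter()
    for timestamp_ms, dimension in rows:
        truncated = timestamp_ms - (timestamp_ms % granularity_ms)
        merged[(truncated, dimension)] += 1
    return merged

# Made-up events: (timestamp in ms, user id).
events = [
    (1_000, "alice"),
    (1_000, "alice"),   # exact duplicate of the row above
    (1_400, "alice"),
    (1_900, "bob"),
    (61_000, "bob"),
]

fine = rollup(events, granularity_ms=1)         # keep millisecond precision
coarse = rollup(events, granularity_ms=60_000)  # truncate to the minute

# Row counts before rollup, at millisecond granularity, and at minute granularity.
print(len(events), len(fine), len(coarse))
# Distinct user ids survive rollup at either granularity.
print(len({dim for _, dim in fine}), len({dim for _, dim in coarse}))
```

Even at the finest granularity the two identical rows collapse into one, which matches the observation above: a raw row count can differ from the source, while a distinct count on the dimension does not.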
The analysis was working on version 10; after migrating to version 11, it started giving this error:
[nQSError: 14025] No fact table exists at the requested level of detail
11g is stricter than 10g was about requiring a conformed data model. These types of errors almost always stem from something in the BMM being set up incorrectly, so I would start there. There are several things it could be.
Check your levels on the LTS being used. Set any levels to total for things you want the LTS to work with, but that it does not join to. This will force OBIEE to ignore that item in the join criteria, since there is no join. Set these levels on the column as well.
If you have levels set to detail, make sure the physical join actually exists.
Run a consistency check, and look for any warnings where it says no physical join, or logical tables source joins table at incorrect grain (I can't remember the exact wording, but you will know it when you see it).
Compare the queries generated in 10g and 11g, or at least take a look at the query generated in 11g and look at your RPD.
Regards
Recently I used an Oracle 11g database for my homework. I had 12 tables, with names like trip_data_11 and trip_data_12.
They have the same structure and almost the same number of records. I created the same indexes on each table.
For the trip_data_11 table:
create index pick_add_11 on trip_data_11(pickup_longitude,pickup_latitude);
create index drop_add_11 on trip_data_11(dropoff_longitude,dropoff_latitude);
I did the same for trip_data_12.
Then I used the following SELECT statement to count the distinct taxis per day.
SELECT
COUNT(DISTINCT(td.medallion)) AS taxi_num
FROM
SYS.TRIP_DATA_11 td
WHERE
(td.pickup_longitude >= -74.2593 AND td.pickup_longitude <= -73.7011
AND td.pickup_latitude >= 40.4770 AND td.pickup_latitude <= 40.9171
)
AND
(td.dropoff_longitude >= -74.2593 AND td.dropoff_longitude <= -73.7011
AND td.dropoff_latitude >= 40.4770 AND td.dropoff_latitude <= 40.9171
)
AND
td.trip_distance > 0
AND
td.passenger_count > 0
GROUP BY
regexp_substr(td.pickup_datetime,'\d{4}-\d{2}-\d{2}')
ORDER BY
regexp_substr(td.pickup_datetime,'\d{4}-\d{2}-\d{2}');
It takes 38 seconds. When I changed the table name to SYS.TRIP_DATA_12, the problem appeared: the query ran for more than 2 hours.
What's more, it never finished. I don't know why.
Today I asked my classmate, and he said to clear the cache. So I used the following statements to do it.
alter system flush shared_pool;
alter system flush buffer_cache;
alter system flush global context;
Now when I run the same SELECT statement against SYS.TRIP_DATA_11, I get the same poor performance as with SYS.TRIP_DATA_12. Why?
It seems like your classmate was having a good joke at your expense.
Clearly your query was only performing well because you had a warm buffer cache full of all the data you needed from TRIP_DATA_11. By flushing the caches you have zapped all that, and now you have the same bad performance for all tables.
Tuning queries is hard, because there are lots of possibilities. Please read the documentation on it.
To pick just one thing: you're searching ranges, which is problematic. How many rows fall between -74.2593 and -73.7011? It might be a lot more than, say, between -71.00 and -68.59, even though that's a broader range. Understanding your data (its volume, its distribution and its skew) is crucial.
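To make the point about skew concrete, here is a small sketch with made-up longitudes (they are not taken from the taxi data). Most values are bunched around one spot, so the narrower interval matches far more rows than the wider one. The `rows_in_range` helper and the distribution parameters are assumptions chosen purely for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Synthetic data: 9,000 points clustered near -73.98, 1,000 spread thinly further east.
longitudes = [random.gauss(-73.98, 0.05) for _ in range(9_000)]
longitudes += [random.uniform(-72.0, -68.0) for _ in range(1_000)]

def rows_in_range(values, low, high):
    """Count values with low <= value <= high."""
    return sum(low <= v <= high for v in values)

narrow = rows_in_range(longitudes, -74.2593, -73.7011)  # about 0.56 degrees wide
wide = rows_in_range(longitudes, -71.00, -68.59)        # about 2.41 degrees wide

print("narrow range:", narrow)
print("wide range:  ", wide)
```

An optimizer that only knew the minimum and maximum longitude would guess the opposite, which is one reason up-to-date statistics matter for range predicates.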
As a first step, learn how to use EXPLAIN PLAN. To get better plans, gather statistics on your tables and their indexes using the DBMS_STATS package.
One tip. Oracle will usually use only one index to access a table, so it will choose pick_add_11 or drop_add_11 but not both. It will then read all the matching records from the table and filter them by the other criteria. You may get much better performance from an index designed to service this query:
create index add_11 on trip_data_11
(pickup_longitude
, pickup_latitude
, dropoff_longitude
, dropoff_latitude
, trip_distance
, passenger_count )
;
The select statement will execute the entire filter against this index and only touch the table to get the MEDALLION values. (You could add medallion to the index too.) Experiment with the column order. As latitude has a narrower range than longitude, it should probably go first; maybe the drop-off values should appear before the pick-up values. You want an index in which the greatest number of related records are clustered together.
Indexes like this can be an overhead, so we wouldn't want to maintain too many of them in real life. But they are a valuable technique for tuning expensive queries which are run frequently.
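Here is a minimal sketch of the single-index idea. SQLite (via Python's sqlite3 module) stands in for Oracle purely so the example runs anywhere; its optimizer is different, the table is cut down to the columns the query touches, and the four rows are invented. The sketch creates the composite index with medallion appended, asks SQLite for its plan, and runs a simplified version of the query without the per-day grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trip_data_11 (
        medallion TEXT,
        pickup_longitude REAL, pickup_latitude REAL,
        dropoff_longitude REAL, dropoff_latitude REAL,
        trip_distance REAL, passenger_count INTEGER
    )
""")
# Invented sample rows: only the first two satisfy every filter.
conn.executemany(
    "INSERT INTO trip_data_11 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("A1", -73.98, 40.75, -73.95, 40.78, 2.1, 1),
        ("B2", -74.00, 40.70, -73.99, 40.72, 0.8, 2),
        ("C3", -71.50, 42.30, -71.40, 42.35, 3.0, 1),   # outside the bounding box
        ("D4", -73.97, 40.76, -73.96, 40.77, 0.0, 1),   # zero trip distance
    ],
)
# One index holding every column the query filters on, plus medallion.
conn.execute("""
    CREATE INDEX add_11 ON trip_data_11
        (pickup_longitude, pickup_latitude, dropoff_longitude, dropoff_latitude,
         trip_distance, passenger_count, medallion)
""")

query = """
    SELECT COUNT(DISTINCT medallion) FROM trip_data_11
    WHERE pickup_longitude BETWEEN -74.2593 AND -73.7011
      AND pickup_latitude BETWEEN 40.4770 AND 40.9171
      AND dropoff_longitude BETWEEN -74.2593 AND -73.7011
      AND dropoff_latitude BETWEEN 40.4770 AND 40.9171
      AND trip_distance > 0 AND passenger_count > 0
"""
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)  # plan text varies by SQLite version
taxi_num = conn.execute(query).fetchone()[0]
print(taxi_num)
```

On Oracle the equivalent check would be to compare the EXPLAIN PLAN output before and after creating the index, ideally after gathering statistics.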
Oh, and @Justin's right: don't use SYS for doing application work. Even for a school assignment you should create a fresh schema and create your tables, etc. in that.
We had a performance issue in our production environment.
We identified that Oracle was executing queries using the wrong index.
The queries have all the columns of the primary key (and nothing else) in their WHERE clause.
After we rebuilt the index and gathered statistics, Oracle started using the PK_INDEX, and the execution plan showed an INDEX UNIQUE SCAN.
It worked fine for a while, and then Oracle started using the wrong index again. The index it uses now consists of 2 columns, of which only 1 appears in the WHERE clause of the query. Now the execution plan shows an INDEX RANGE SCAN and the system is very slow.
Please let me know how we could get to the root of this issue.
Try gathering stats again. If you get the expected execution plan, it means that the changes made to the table since the last stats gathering made Oracle think the less favorable execution plan was better.
So your question here is really "How can I maintain plan stability?"
You have several options:
Use hints in your query to indicate the exact access path.
Use outlines.
I personally don't like these two approaches, because if your data changes in the future in such a way that the execution plan should change, you'll get lousy performance.
So the third option (and my personal favorite) is to enable periodic statistics gathering. Oracle knows how to spot the changes and incrementally update the relevant stats.