How can I optimize these Postgres queries and DB performance?

I need some help optimizing the following two queries, which are similar but select slightly different data. Here is my table definition:
CREATE TABLE public.rates (
    rate_id bigserial NOT NULL,
    prefix varchar(50),
    rate_name varchar(30),
    rate numeric(8,6),
    intrastate_cost numeric(8,6),
    interstate_cost numeric(8,6),
    status char(3) DEFAULT 'act'::bpchar,
    min_duration integer,
    call_increment integer,
    connection_cost numeric(8,6),
    rate_type varchar(3) DEFAULT 'lcr'::character varying,
    owner_type varchar(10),
    start_date timestamp WITHOUT TIME ZONE,
    end_date timestamp WITHOUT TIME ZONE,
    rev integer,
    ratecard_id integer,
    /* Keys */
    CONSTRAINT rates_pkey
        PRIMARY KEY (rate_id)
) WITH (
    OIDS = FALSE
);
and here are the two queries I am using,
SELECT
rates.* ,
rc.ratecard_prefix ,
rc.default_lrn ,
rc.lrn_lookup_method ,
customers.customer_id ,
customers.balance ,
customers.channels AS customer_channels ,
customers.cps AS customer_cps ,
customers.balance AS customer_balance
FROM
rates
JOIN ratecards rc
ON rc.card_type = 'customer' AND
rc.ratecard_id = rates.ratecard_id
JOIN customers
ON rc.customer_id = customers.customer_id
WHERE
customers.status = 'act' AND
rc.status = 'act' AND
rc.customer_id = 'AB8KA191' AND
owner_type = 'customer' AND
'17606109973' LIKE concat (rc.ratecard_prefix, rates.prefix, '%') AND
rates.status = 'act' AND
now() BETWEEN rates.start_date AND
rates.end_date AND
customers.balance > 0
ORDER BY
LENGTH(PREFIX) DESC LIMIT 1;
and the second one,
SELECT
*
FROM
rates
JOIN ratecards rc
ON rc.card_type = 'carrier' AND
rc.ratecard_id = rates.ratecard_id
JOIN carriers
ON rc.carrier_id = carriers.carrier_id
JOIN carrier_switches cswitch
ON carriers.carrier_id = cswitch.carrier_id
WHERE
rates.intrastate_cost < 0.011648 AND
owner_type = 'carrier' AND
'16093960411' LIKE concat (rates.prefix, '%') AND
rates.status = 'act' AND
carriers.status = 'act' AND
now() BETWEEN rates.start_date AND
rates.end_date AND
rates.intrastate_cost <> -1 AND
cswitch.enabled = 't' AND
rates.rate_type = 'lrn' AND
rates.min_duration >= 6
ORDER BY
rates.intrastate_cost ASC,
LENGTH(rates.prefix) DESC,
cswitch.priority DESC
I created an index on the owner_type field (not shown in the schema above), but query performance is not what I expected. CPU usage on the DB server becomes too high and everything starts to slow down. The EXPLAIN output for the first query is here and for the second one is here.
When the table has few records, things work fine, naturally, but as the number of records increases the CPU usage climbs. I currently have around 341,821 records in the table.
How can I improve the query execution or possibly change the query in order to speed things up?
I have set enable_bitmapscan = off because I think this gives me better performance. If it is set to on, every index scan is followed by a Bitmap Heap Scan.
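One way to verify that assumption without touching postgresql.conf is to toggle the setting per session and compare the two plans directly; a minimal sketch (the SELECT is just the first query above):
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ;   -- paste the first query here

SET enable_bitmapscan = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ;   -- same query again, for comparison

RESET enable_bitmapscan;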
Things did ease up a little bit by changing the query to
SELECT
rates.*,
rc.ratecard_prefix,
rc.default_lrn,
rc.lrn_lookup_method,
customers.customer_id,
customers.balance,
customers.channels AS customer_channels,
customers.cps AS customer_cps,
customers.balance AS customer_balance
FROM
rates
JOIN ratecards rc
ON rc.card_type = 'customer' AND
rc.ratecard_id = rates.ratecard_id
JOIN customers
ON rc.customer_id = customers.customer_id
WHERE
customers.status = 'act' AND
rc.status = 'act' AND
rc.customer_id = 'AB8KA191' AND
owner_type = 'customer' AND
(CONCAT (rc.ratecard_prefix, rates.prefix) IN ('16026813306',
'1602681330',
'160268133',
'16026813',
'1602681',
'160268',
'16026',
'1602',
'160',
'16',
'1')) AND
rates.status = 'act' AND
now() BETWEEN rates.start_date AND
rates.end_date AND
customers.balance > 0
ORDER BY
LENGTH(PREFIX) DESC LIMIT 1
Postgres.conf is here
But each Postgres process still takes around 25%+ CPU. I am now also using pgbouncer for connection pooling, but it is still not helping.

Why isn't it using the index?

Hello kind people of the internet.
I am racking my brain trying to figure out why the optimiser isn't using my index for my query on Amazon Aurora. The query is dynamically created based on a report users have created through the application's UI, so I can't change the query per se.
The query uses these qualifiers
WHERE
table_in_question.deleted = 0
ORDER BY
table_in_question.date_modified DESC,
table_in_question.id DESC
I have an index, "my_index", which covers exactly these fields of table_in_question (deleted, date_modified, id), but MySQL doesn't use it.
The query takes approx. 1200 ms to run. If I add FORCE INDEX (my_index) it takes about 120 ms, roughly 10x faster, but unless I use FORCE INDEX it doesn't use the index.
Around 1 million rows are examined according to EXPLAIN, so I don't think the index is being skipped because too few rows would be returned.
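For reference, a minimal sketch of the index and the hint described above, using the column names from the query (the trimmed-down SELECT list is illustrative only):
-- Composite index matching the WHERE + ORDER BY shape.
ALTER TABLE table_in_question
  ADD INDEX my_index (deleted, date_modified, id);

-- With the hint, as described above (~120 ms instead of ~1200 ms).
SELECT id, date_modified
FROM table_in_question FORCE INDEX (my_index)
WHERE deleted = 0
ORDER BY date_modified DESC, id DESC;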
The full query is
SELECT
case when some_table.id IS NOT NULL then some_table.id else "" end my_favorite,
table_in_question.date_entered,
table_in_question.name,
table_in_question.description,
table_in_question.pr_is_read,
table_in_question.pr_is_approved,
table_in_question.parent_type,
table_in_question.parent_id,
table_in_question.id,
table_in_question.date_modified,
table_in_question.assigned_user_id,
table_in_question.created_by
FROM
table_in_question
INNER JOIN (
SELECT
tst.team_user_is_member_of
FROM
team_sets_teams tst
INNER JOIN team_memberships team_membershipstable_in_question ON (
team_membershipstable_in_question.team_id = tst.team_id
)
AND (team_membershipstable_in_question.user_id = 'UUID')
AND (team_membershipstable_in_question.deleted = 0)
GROUP BY
tst.team_user_is_member_of
) table_in_question_tf ON table_in_question_tf.team_user_is_member_of = table_in_question.team_user_is_member_of
LEFT JOIN systemfavourites sf_table_in_question ON (sf_table_in_question.module = 'table_in_question')
AND (sf_table_in_question.record_id = table_in_question.id)
AND (sf_table_in_question.assigned_user_id = 'UUID')
AND (sf_table_in_question.deleted = '0')
INNER JOIN opportunities jt1_table_in_question ON (table_in_question.opportunity_id = jt1_table_in_question.id)
AND (jt1_table_in_question.deleted = 0)
LEFT JOIN another_table jt1_table_in_question_cstm ON jt1_table_in_question_cstm.id_c = jt1_table_in_question.id
LEFT JOIN systemfavourites table_in_question_favorite ON (table_in_question.id = table_in_question_favorite.record_id)
AND (table_in_question_favorite.deleted = '0')
AND (table_in_question_favorite.module = 'table_in_question')
AND (table_in_question_favorite.created_by = 'UUID')
LEFT JOIN users some_table ON (
some_table.id = table_in_question_favorite.modified_user_id
)
AND (some_table.deleted = 0)
WHERE
table_in_question.deleted = 0
ORDER BY
table_in_question.date_modified DESC,
table_in_question.id DESC
;
EXPLAIN shows this
id:            1
select_type:   PRIMARY
table:         table_in_question
partitions:    NULL
type:          ALL
possible_keys: idx_table_in_question_tmst_id
key:           NULL
key_len:       NULL
ref:           NULL
rows:          968234
filtered:      10.0
Extra:         Using where; Using temporary; Using filesort
Can anyone explain how I can make an index that it will actually use by default?
Thanks.

Insert into table two and update table two for BigQuery in one query

I am using Standard SQL in BigQuery. I am writing a scheduled query which inserts records into table (2). Now, given that it's scheduled, I am trying to figure out how to also update records in table (2) from the scheduled query, which is always inserting records into table (2).
In particular, when there is a record in table (2) that was not generated by my query, I want to update table (2) and set a boolean column to No.
Below is my query. Where in the query would I add the update logic for table (2)?
INSERT INTO record (airport_name, icao_address, arrival, flight_number, origin_airport_icao, destination_airport_icao)
WITH
planes_stopped_in_airport AS (
SELECT
p.IATA_airport_code,
p.airport_name,
p.airport_ISO_country_code,
p.ICAO_airport_code,
timestamp,
a.icao_address,
a.latitude,
a.longitude,
a.altitude_baro,
a.speed,
heading,
callsign,
source,
a.collection_type,
vertical_rate,
squawk_code,
icao_actype,
flight_number,
origin_airport_icao,
destination_airport_icao,
scheduled_departure_time_utc,
scheduled_arrival_time_utc,
estimated_arrival_time_utc,
tail_number,
ingestion_time
FROM
`updates` a
JOIN
Polygons p
ON
1 = 1
WHERE
a.timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 20 MINUTE) and a.timestamp <= CURRENT_TIMESTAMP()
AND ( latitude IS NULL
AND longitude IS NULL
AND callsign IS NULL
AND speed IS NULL
AND heading IS NULL
AND altitude_baro IS NULL) IS FALSE
AND ST_DWithin( ST_GeogFromText( polygon ),
ST_GeogPoint(a.longitude,
a.latitude),
10)
AND a.collection_type = '1' -- and speed < 50
AND (origin_airport_icao IS NULL
AND destination_airport_icao IS NULL) IS FALSE )
SELECT
p.airport_name,
icao_address,
MIN(timestamp) AS Arrival,
flight_number,
origin_airport_icao,
destination_airport_icao
FROM
planes_stopped_in_airport p
WHERE
flight_number NOT IN (SELECT Distinct flight_number
FROM `table(2)`
)
GROUP BY
icao_address,
p.airport_name,
flight_number,
origin_airport_icao,
destination_airport_icao
HAVING
flight_number IS NOT NULL
ORDER BY
airport_name,
arrival
You can probably do it with a MERGE statement; see the details at https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#merge_statement.
If I understood your requirements correctly, you need something like
MERGE dataset.Destination T
USING (SELECT * ...) S
ON T.key = S.key
WHEN MATCHED THEN
UPDATE SET T.foo = S.foo, T.bool_flag = FALSE
WHEN NOT MATCHED THEN
INSERT ...
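Sketching that out against the tables in the question: assuming flight_number is the key that identifies a row in table (2) (written as dataset.record below) and that the boolean column is called is_active (both of these names are assumptions), the scheduled statement could look roughly like this, with the WITH ... SELECT from the question pasted in as the source:
MERGE `dataset.record` T
USING (
  -- paste the WITH planes_stopped_in_airport ... SELECT from the question here
  SELECT airport_name, icao_address, arrival, flight_number,
         origin_airport_icao, destination_airport_icao
  FROM planes_stopped_in_airport
) S
ON T.flight_number = S.flight_number
WHEN MATCHED THEN
  UPDATE SET T.arrival = S.arrival, T.is_active = TRUE    -- is_active is a hypothetical column name
WHEN NOT MATCHED THEN
  INSERT (airport_name, icao_address, arrival, flight_number,
          origin_airport_icao, destination_airport_icao)
  VALUES (S.airport_name, S.icao_address, S.arrival, S.flight_number,
          S.origin_airport_icao, S.destination_airport_icao)
WHEN NOT MATCHED BY SOURCE THEN
  UPDATE SET T.is_active = FALSE;                          -- rows not produced by this run get flagged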

Looking for whether a row exists in a subquery

Spoiler alert: I am fairly new to Oracle.
I have four tables: enrollments, courses/sections, standards, and grades.
We are running Honor Roll. I have queries on the first three tables that add the various constraints needed to meet honor roll requirements. Then we look at the grades table: if a student has a valid enrollment, in a valid course, meeting valid standards, then we count up their scores. If their score quantity meets the thresholds, they get Honors.
This code is not optimized and can likely be done in a far better/more compact way, I'm sure; however, it only gets run a few times a year, so I'm willing to trade off optimization in order to increase human readability, so that I can continue to learn the fundamentals. So far I have:
WITH validCC (SELECT CC.ID AS CCID,
CC.STUDENTID AS STUDENTID,
CC.SECTIONID AS SECTIONID,
CC.TERMID AS TERMID,
STUDENTS.DCID AS STUDENTSDCID
FROM CC
INNER JOIN STUDENTS ON CC.STUDENTID = STUDENTS.ID
WHERE TERMID in (2700,2701)
AND CC.SCHOOLID = 406;
), --end validCC
validCrsSect (SELECT SECTIONS.ID AS SECTIONID,
SECTIONS.DCID AS SECTIONSDCID,
SECTIONS.EXCLUDEFROMHONORROLL AS SECTHR,
COURSES.COURSE_NUMBER AS COURSE_NUMBER,
COURSES.COURSE_NAME AS COURSE_NAME,
COURSES.EXCLUDEFROMHONORROLL AS CRSHR
FROM SECTIONS
INNER JOIN COURSES ON SECTIONS.COURSE_NUMBER = COURSES.COURSE_NUMBER AND SECTIONS.SCHOOLID = COURSES.SCHOOLID
WHERE SECTIONS.TERMID IN (2700,2701)
AND SECTIONS.SCHOOLID = 406
AND SECTIONS.EXCLUDEFROMHONORROLL = 0
AND COURSES.EXCLUDEFROMHONORROLL = 0
), --end validCrsSect
validStandard (SELECT STANDARDID,
IDENTIFIER,
TRANSIENTCOURSELIST
FROM STANDARD
WHERE isActive = 1
AND YEARID = 27
AND ( instr (STANDARD.identifier, 'MHS.TS', 1 ,1) > 0 --Is a valid standard for this criteria: MHS TS
or STANDARD.identifier = 'MHTC.TS.2' --or MHTC TS
or STANDARD.identifier = 'MHTC.TS.4' )
), --end validStandard
--sgsWithChecks (
SELECT sgs.STANDARDGRADESECTIONID AS SGSID,
sgs.STUDENTSDCID as STUDENTSDCID,
sgs.STANDARDID AS STANDARDID,
sgs.STORECODE AS STORECODE,
sgs.SECTIONSDCID AS SECTIONSDCID,
sgs.YEARID AS YEARID,
sgs.STANDARDGRADE AS STANDARDGRADE,
(select count(CCID) from validCC INNER JOIN STANDARDGRADESECTION sgs ON sgs.STUDENTSDCID = validCC.STUDENTSDCID and sgs.SECTIONSDCID = validCC.SECTIONID) as CC_OK,
(select count(SECTIONID) from validCrsSection INNER JOIN STANDARDGRADESECTION sgs ON sgs.SECTIONSDCID = validCrsSect.SECTIONSDCID) AS CRS_OK,
(select count(STANDARDID) from validStandard INNER JOIN STANDARDGRADESECTION sgs ON sgs.STANDARDID = validStandard.STANDARDID) AS STD_OK
FROM STANDARDGRADESECTION sgs
The purpose of putting the 'OK' columns in the vGrades table is that the final SELECT (not included) goes through and counts up the instances of certain scores, filtering by those checks.
Frustratingly, there are two IDs in both the students table and the sections table (and they don't hold the same data). So when I go to link everything, some tables use ID as the FK and others use DCID as the FK, and I have to pull in an extra table to make that conversion. Makes the joins more fun that way, I guess.
Each individual query works on its own, but I can't get the final SELECT COUNT() to work to pull the data. I tried embedding the initial queries as subqueries, but I couldn't pass the studentid into them, and it would run that query for each student instead of once at the beginning.
My current error is:
Error starting at line : 13 in command -
SECTIONS.DCID AS SECTIONSDCID,
Error report -
Unknown Command
However before it was saying unknown table and referencing the last line of the join statement. All the table names are valid.
Thoughts?
I replaced the INNER JOIN with a simple WHERE condition. This seems to work.
(SELECT COUNT (CCID) FROM validCC WHERE sgs.STUDENTSDCID = validCC.STUDENTSDCID and sgs.SECTIONSDCID = validCC.SECTIONID) as CC_OK,
(SELECT COUNT (SECTIONID) FROM validCrsSect WHERE sgs.SECTIONSDCID = validCrsSect.SECTIONSDCID) AS CRS_OK,
(SELECT COUNT (STANDARDID) FROM validStandard WHERE sgs.STANDARDID = validStandard.STANDARDID) AS STD_OK
I removed the stray comma at the end of validStandard and replaced from validCrsSection with from validCrsSect (assuming it was meant to refer to that WITH clause and there isn't another validCrsSection table). I am also guessing that the counts are meant to be keyed to the current sgs row and not counts of the whole table. I make it this:
with validcc as
( select cc.id as ccid
, cc.studentid
, cc.sectionid
, cc.termid
, st.dcid as studentsdcid
from cc
join students st on st.id = cc.studentid
where cc.termid in (2700, 2701)
and cc.schoolid = 406
)
, validcrssect as
( select s.id as sectionid
, s.dcid as sectionsdcid
, s.excludefromhonorroll as secthr
, c.course_number
, c.course_name
, c.excludefromhonorroll as crshr
from sections s
join courses c
on c.course_number = s.course_number
and c.schoolid = s.schoolid
where s.termid in (2700, 2701)
and s.schoolid = 406
and s.excludefromhonorroll = 0
and c.excludefromhonorroll = 0
)
, validstandard as
( select standardid
, identifier
, transientcourselist
from standard
where isactive = 1
and yearid = 27
and ( instr(standard.identifier, 'MHS.TS', 1, 1) > 0
or standard.identifier in ('MHTC.TS.2','MHTC.TS.4') )
)
select sgs.standardgradesectionid as sgsid
, sgs.studentsdcid
, sgs.standardid
, sgs.storecode
, sgs.sectionsdcid
, sgs.yearid
, sgs.standardgrade
, ( select count(*) from validcc
where validcc.studentsdcid = sgs.studentsdcid
and validcc.sectionid = sgs.sectionsdcid ) as cc_ok
, ( select count(*) from validcrssect
where validcrssect.sectionsdcid = sgs.sectionsdcid ) as crs_ok
, ( select count(*) from validstandard
where validstandard.standardid = sgs.standardid ) as std_ok
from standardgradesection sgs;
This works with the six table definitions reverse-engineered as:
create table students
( id integer not null
, dcid integer );
create table cc
( id integer
, studentid integer
, sectionid integer
, termid integer
, schoolid integer );
create table courses
( course_number integer
, course_name varchar2(30)
, excludefromhonorroll integer
, schoolid integer );
create table sections
( id integer not null
, dcid integer
, excludefromhonorroll integer
, termid integer
, schoolid integer
, course_number integer );
create table standard
( standardid integer
, identifier varchar2(20)
, transientcourselist varchar2(50)
, isactive integer
, yearid integer );
create table standardgradesection
( standardgradesectionid integer
, studentsdcid integer
, standardid integer
, storecode integer
, sectionsdcid integer
, yearid integer
, standardgrade integer );

Simple condition breaks down the query optimizer and its performance

I have a simple query:
select top 10 *
FROM Revision2UploadLocations r2l
inner join Revisions r on r2l.RevisionId = r.Id
INNER JOIN [Databases] [D] on [R].[DatabaseId] = [D].[Id]
INNER JOIN [SqlServers] [S] on [D].[InstanceId] = [S].[Id]
where --r.ValidationStatus in (2, 3) and
r2l.[ChecksumWasSent] = 0 AND r2l.Status = 2
This query usually executes in 0.5 s:
But the same query with the condition uncommented executes for 5 s (!!!) and has a very strange execution plan: Revisions and SqlServers are joined although they have no linked columns, and the most selective condition, "r2l.[ChecksumWasSent] = 0 AND r2l.Status = 2", is applied at the end of query processing.
ValidationStatus is an ordinary int not null column.
Columns Revision2UploadLocations.RevisionId, Revisions.DatabaseId, Databases.InstanceId are indexed.
Here is a description of the tables:
CREATE TABLE [SqlServers]
(
[Id] int identity(1,1) NOT NULL CONSTRAINT PK_SqlServers PRIMARY KEY,
...
)
CREATE TABLE [Databases](
[Id] int identity(1,1) NOT NULL CONSTRAINT PK_Databases PRIMARY KEY,
[InstanceId] int NOT NULL,
[Name] nvarchar(128) NOT NULL,
...
CONSTRAINT FK_Databases_SqlServers FOREIGN KEY ([InstanceId]) REFERENCES [SqlServers]([Id])
)
CREATE INDEX [IX_Databases_DatabaseId] ON [Databases] ([InstanceId] ASC)
CREATE TABLE [Revisions]
(
[Id] int identity(1, 1) NOT NULL,
[DatabaseId] int NOT NULL,
[BackupStatus] tinyint NOT NULL,
[ValidationStatus] tinyint NOT NULL,
...
CONSTRAINT PK_Revisions PRIMARY KEY([Id]),
CONSTRAINT FK_Revisions_Databases FOREIGN KEY ([DatabaseId]) REFERENCES [Databases]([Id])
)
CREATE INDEX [IX_Revisions_DatabaseId] ON [Revisions] ([DatabaseId] ASC)
CREATE TABLE [Revision2UploadLocations]
(
[Id] int NOT NULL IDENTITY (1, 1) CONSTRAINT PK_Revision2UploadLocations PRIMARY KEY,
[Status] int NOT NULL,
RevisionId int NOT NULL,
[ChecksumWasSent] bit NOT NULL,
CONSTRAINT FK_r2l_Revisions FOREIGN KEY ([RevisionId]) REFERENCES [Revisions]([Id])
)
CREATE INDEX [IX_Revision2UploadLocations_RevisionId] ON [Revision2UploadLocations] ([RevisionId] ASC)
How can I improve the performance of this query?
EDIT: Now I have some more details.
Some tables (SqlServers and Databases) have 1-10 records, but Revisions and Revision2UploadLocations have 500K+ records, so the query optimizer decides to use a full scan instead of an index seek for the small tables and processes them first.
Query Performance Tuning (SQL Server Compact):
A small table is one whose contents fit in one or just a few data pages. Avoid indexing very small tables because it is typically more efficient to do a table scan.
As a temporary solution I tried to use the query hint FORCE ORDER: Query Hint (SQL Server Compact),
and the response time decreased from 5 sec to 0.5 sec.
But I don't think that it's a good solution.
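For completeness, this is roughly how that hint attaches to the query (a sketch only, using the OPTION clause described in the linked Query Hint documentation):
SELECT TOP 10 *
FROM Revision2UploadLocations r2l
INNER JOIN Revisions r ON r2l.RevisionId = r.Id
INNER JOIN [Databases] [D] ON [R].[DatabaseId] = [D].[Id]
INNER JOIN [SqlServers] [S] ON [D].[InstanceId] = [S].[Id]
WHERE r.ValidationStatus IN (2, 3)
  AND r2l.[ChecksumWasSent] = 0 AND r2l.Status = 2
OPTION (FORCE ORDER);   -- joins are evaluated in the order written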
Geoffrey's solution doesn't give you the expected result.
The first statement selects 10 rows with no guarantee that their r.ValidationStatus is 2 or 3. So in the end you can get fewer than 10 rows (or even no rows at all).
I think you can rewrite your query like this:
SELECT top 10 *
FROM Revisions r
INNER JOIN Revision2UploadLocations r2l
ON r2l.RevisionId = r.Id
AND r2l.[ChecksumWasSent] = 0
AND r2l.Status = 2
INNER JOIN [Databases] [D] on [D].[Id] = [R].[DatabaseId]
INNER JOIN [SqlServers] [S] on [S].[Id] = [D].[InstanceId]
WHERE r.ValidationStatus in (2, 3)
And if the r2l.[ChecksumWasSent] datatype is bit (boolean), then (see the sketches after this list):
with more 0s than 1s, you can create an index on RevisionId + Status;
with far more 1s than 0s, you can create an index on RevisionId + ChecksumWasSent + Status.
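A sketch of those two variants (the index names are made up):
-- Mostly 0s in ChecksumWasSent: the narrower index is usually enough.
CREATE INDEX IX_r2l_RevisionId_Status
    ON Revision2UploadLocations (RevisionId, Status);

-- Mostly 1s: include the bit column so the few 0 rows are cheap to locate.
CREATE INDEX IX_r2l_RevisionId_Checksum_Status
    ON Revision2UploadLocations (RevisionId, ChecksumWasSent, Status);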
I have found in the past that if I first insert the first part of your query into a temp table, including the field you want to filter on further ("ValidationStatus"), and then query the temp table, the performance/speed is much better.
So the initial query would be this:
select *
into #tmp
FROM Revision2UploadLocations r2l
inner join Revisions r on r2l.RevisionId = r.Id
INNER JOIN [Databases] [D] on [R].[DatabaseId] = [D].[Id]
INNER JOIN [SqlServers] [S] on [D].[InstanceId] = [S].[Id]
where --r.ValidationStatus in (2, 3) and
r2l.[ChecksumWasSent] = 0 AND r2l.Status = 2
then the final select would be:
select * from #tmp
where ValidationStatus in (2,3)
No need for indexes, and I know it's weird how the optimizer doesn't always do the right thing, but this approach has been useful to me several times in the past.

How to improve Oracle query performance with 22,000,000 records in a single table?

I have Oracle 10g installed on Windows Server 2003.
I have 22,000,000 records in a single table, and it is a transactional table; the number of records grows by approximately 50,000 per month.
My problem is that whenever I run a query against it, the query is always too slow.
Is there any method by which I can improve the performance of the query, such as partitioning the table?
the query is
select a.prd_code
, a.br_code||'-'||br_title
, a.size_code||'-'||size_title
,size_in_gms
, a.var_code||'-'||var_title
, a.form_code||'-'||form_title
, a.pack_code||'-'||pack_title
, a.pack_type_code||'-'||pack_type_title
, start_date
, end_date
, a.price
from prices a
, brand br
, (select distinct prd_code
, br_code
, size_code
, var_code
, form_code
,packing_code
, pack_type_code
from cphistory
where prd_code = '01'
and flag = 'Y'
and project_yy = '2009' and '01' and '10') cp
, (select prd_code
, br_code
, size_code
, size_in_gms
from sizes
where prd_code = '01'
and end_date = '31-dec-2050'
and flag = 'Y') sz
, (select prd_code
, br_code
, var_code
, var_title
from varient) vt
, (select prd_code
, br_code
, form_code
, form_title
from form) fm
, (select prd_code
, br_code
, pack_code
, pack_title
from package) pc
, (select prd_code
, pack_type_code
, pack_type_title
from pack_type) pt
where a.prd_code = br.prd_code
and a.br_code = br.br_code
and a.prd_code = sz.prd_code
and a.br_code = sz.br_code
and a.size_code = sz.size_code
and a.prd_code = vt.prd_code
and a.br_code = vt.br_code
and a.var_code = vt.var_code
and a.prd_code = fm.prd_code
and a.br_code = fm.br_code
and a.form_code = fm.form_code
and a.prd_code = pc.prd_code
and a.br_code = pc.br_code
and a.pack_code = pc.pack_code
and a.prd_code = pt.prd_code
and a.pack_type_code = pt.pack_type_code
and end_date = '2009'
and prd_code = '01'
order by a.prd_code
, a.br_code
, a.size_code
, a.var_code
, a.pack_code
, a.form_code
tables used in this query are:
prices : has more than 2.1M rows
cphistory : has more than 2.2M rows
sizes : has more than 5000 rows
brand : has more than 1200 rows
varient : has more than 1800 rows
package : has more than 200 rows
pack_type : has more than 150 rows
Check indexes. Make sure you have a primary key. Alternate candidate keys should have unique constraints and indexes.
Run EXPLAIN PLAN on queries and see how the optimizer is running them. If you see TABLE SCAN, add indexes.
Make sure the optimizer is using statistics to make its job easier.
Move historical data into warehouses if you must.
22M records isn't that enormous.
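To illustrate the EXPLAIN PLAN and statistics advice above, a sketch for Oracle (table names taken from the question; substitute the full query):
-- Refresh optimizer statistics for the big tables (repeat for the others).
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'PRICES',    cascade => TRUE);
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'CPHISTORY', cascade => TRUE);

-- Inspect the plan; TABLE ACCESS FULL on prices or cphistory suggests a missing index.
EXPLAIN PLAN FOR
SELECT a.prd_code, a.br_code, a.price
FROM prices a
WHERE a.prd_code = '01';          -- replace with the full query from the question

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);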
You should probably start with explain plans, but do this:
1. Take each of the queries in the "FROM" clause out and do an explain plan on each one. Verify that each one is hitting an appropriate index. If not, add indexes for each of them so that each of the sub-queries is fast.
2. (Only if lots of data is returned.) Take out the ORDER BY from the main query and run it. See if it is a lot faster. If that is the case, then your time is spent sorting the data and you need to look into why you are having a slow sort.
3. Pull out the sub-queries. "vt", "fm", "pc" and "pt" are selecting the entire tables in their sub-queries. When I test with this, putting in sub-queries like that causes 10g to miss the indexes on the table completely. Just put the tables into the FROM clause and let the Oracle optimizer use the indexes.
4. Try folding all the criteria on "cp" and "sz" into the main query, remove the sub-queries, and see if that makes a difference.
Lots and lots of explain plans, and more than a little careful thought. I wish that I could help more.
Why are the queries slow? Are they doing table scans on the large table? Normally, OLTP queries would be fetching a relatively small number of rows based on a primary key or other indexed column. If your queries are not using indexes and they are the typical sort of OLTP queries that would benefit from using indexes, that would be the place to start.
If you regularly need to query a large number of rows from this table, such that a table scan would be the more efficient access path, you could look into either using materialized views to pre-aggregate the data or into partitioning the table. Partitioning, however, is an extra cost option on top of your enterprise edition license, so you'll generally want to exhaust your other options before going down that path.
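If partitioning does turn out to be justified, a rough sketch of range-partitioning the prices table (purely illustrative: it assumes end_date is a DATE column, and the partition names and bounds are made up):
CREATE TABLE prices_part
PARTITION BY RANGE (end_date)
( PARTITION p2009 VALUES LESS THAN (DATE '2010-01-01')
, PARTITION p2010 VALUES LESS THAN (DATE '2011-01-01')
, PARTITION pmax  VALUES LESS THAN (MAXVALUE)
)
AS SELECT * FROM prices;   -- verify the data, then swap the table names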
