I have a huge issue with query performance in my PostgreSQL database.
The PostgreSQL version is: "PostgreSQL 8.4.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070115 (prerelease) (SUSE Linux), 64-bit"
My config file is set as follows:
shared_buffers = 8GB
effective_cache_size = 24GB
work_mem = 419430kB
maintenance_work_mem = 2GB
checkpoint_segments = 128
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
constraint_exclusion = on
The query which I need to execute is:
SELECT events.evt_id, events.evt_time, events.device_evt_time , to_ip_char(events.sip) , evt_agent.port , events.rv40 , events.evt , events.msg , events.sun , events.rv35 , events.dun , events.rv45 , events.fn , events.dp , events.trgt_trust_name , events.trgt_trust_domain , events.rv36 , events.rv43 , events.cv21 , events.cv40 , events.cv41 , events.cv42 , events.cv43 , events.cv44 , events.cv50 , events.cv51 , events.cv52 , events.cv53 , events.cv54 , events.cv55 , events.cv56 , events.cv35 , events.cv60 , events.cv61 , events.cv62
FROM events, evt_agent
WHERE
events.agent_id = evt_agent.agent_id AND (evt_agent.port::text = ANY (ARRAY['x'::character varying, 'y'::character varying, 'z'::character varying]::text[]))
AND events.evt::text <> 'Internal Message'::text
AND events.evt_time > '2015-12-31 13:23:55.767+00'::timestamptz
ORDER BY events.evt_time
LIMIT 10;
Table events is a partitioned table, one partition for each day.
Each partition has two constraints:
Name events_p_YYYYMMDDHHMISS_events_p_max_pk
Columns evt_time, evt_id
Name events_p_YYYYMMDDHHMISS_dc
Definition evt_time > '2015-12-04 13:24:25.267973+00'::timestamp with time zone AND evt_time <= '2015-12-05 13:24:25.267973+00'::timestamp with time zone
and seven indexes:
Name events_p_YYYYMMDDHHMISS_events_p_max_identity_ix1
Columns evt_time, init_usr_identity_guid, rid02
Operator classes timestamptz_ops, uuid_ops, int8_ops
Name events_p_YYYYMMDDHHMISS_events_p_max_identity_ix2
Columns evt_time, trgt_usr_identity_guid, rid02
Operator classes timestamptz_ops, uuid_ops, int8_ops
Name events_p_YYYYMMDDHHMISS_events_p_max_ix1
Columns evt_time, sev, agent_id
Operator classes timestamptz_ops, int4_ops, int8_ops
Name events_p_YYYYMMDDHHMISS_events_p_max_ix2
Columns evt_time, dip, sev
Operator classes timestamptz_ops, int4_ops, int4_ops
Name events_p_YYYYMMDDHHMISS_events_p_max_ix3
Columns evt_time, res, sev
Operator classes timestamptz_ops, text_ops, int4_ops
Name events_p_YYYYMMDDHHMISS_events_p_max_ix4
Columns evt_time, sip, sev
Operator classes timestamptz_ops, int4_ops, int4_ops
Name events_p_YYYYMMDDHHMISS_events_p_max_ix5
Columns evt_time, txnmy_id, agent_id
Operator classes timestamptz_ops, int8_ops, int8_ops
Table evt_agent is a dictionary table with about 20-30 rows only.
constraints:
Name evt_agent_pk
Columns agent_id
indexes:
Name evt_agent_ak1
Columns agent, port, rn, pn, sn, st, device_ctgry, src_id, cust_id
Operator classes text_ops, text_ops, text_ops, text_ops, text_ops, text_ops, text_ops, uuid_ops, int8_ops
Name evt_agent_ix1
Columns device_ctgry, agent_id
Operator classes text_ops, int8_ops
Name evt_agent_ix2
Columns st, agent_id
Operator classes text_ops, int8_ops
Name test_ev_ag_indx1
Columns agent_id
Operator classes int8_ops
When I execute the query, I get this explain plan:
http://explain.depesz.com/s/3xcu
I thought there was a problem with the statistics, so I ran VACUUM ANALYZE on all related tables, but there was no improvement in query performance or in the explain plan.
I tried the "inner query" trick; the explain plan is here:
http://explain.depesz.com/s/FAkH
From what I can understand, it got somewhat worse.
Do you have any idea how to get a better execution plan for this query?
Now it takes about 42 minutes to get any results from the query.
Thanks in advance!
We will have to try to eliminate the hash join. A lateral join might just be the answer (note that LATERAL is only available from PostgreSQL 9.3 onward, so it cannot run on 8.4.3 as-is):
SELECT events.evt_id, events.evt_time, events.device_evt_time , to_ip_char(events.sip) , evt_agent.port , events.rv40 , events.evt , events.msg , events.sun , events.rv35 , events.dun , events.rv45 , events.fn , events.dp , events.trgt_trust_name , events.trgt_trust_domain , events.rv36 , events.rv43 , events.cv21 , events.cv40 , events.cv41 , events.cv42 , events.cv43 , events.cv44 , events.cv50 , events.cv51 , events.cv52 , events.cv53 , events.cv54 , events.cv55 , events.cv56 , events.cv35 , events.cv60 , events.cv61 , events.cv62
FROM evt_agent,
LATERAL (SELECT *
FROM events AS e
WHERE
e.agent_id = evt_agent.agent_id
AND e.evt::text <> 'Internal Message'::text
AND e.evt_time > '2015-12-31 13:23:55.767+00'::timestamptz
LIMIT 10000) AS events
WHERE (evt_agent.port::text = ANY (ARRAY['x'::character varying, 'y'::character varying, 'z'::character varying]::text[]))
ORDER BY events.evt_time
LIMIT 10000;
Please try it with and without the inner limit and post the explain plans for both.
Spoiler alert: I am fairly new to Oracle.
I have four tables: enrollments, courses/sections, standards, and grades.
We are running Honor Roll. I have queries on the first three tables that add various constraints needed to meet honor roll requirements. Then we look at the grades table. If they have a valid enrollment, in a valid course, meeting valid standards, then count up their scores. If their score qty meets thresholds, then they get Honors.
This code is not optimized, and likely can be done in a far better/more compact way I'm sure -- however, it only gets run a few times a year, so I'm willing to trade off optimization in order to increase human readability, so that I can continue to learn the fundamentals. So far I have:
WITH validCC (SELECT CC.ID AS CCID,
CC.STUDENTID AS STUDENTID,
CC.SECTIONID AS SECTIONID,
CC.TERMID AS TERMID,
STUDENTS.DCID AS STUDENTSDCID
FROM CC
INNER JOIN STUDENTS ON CC.STUDENTID = STUDENTS.ID
WHERE TERMID in (2700,2701)
AND CC.SCHOOLID = 406;
), --end validCC
validCrsSect (SELECT SECTIONS.ID AS SECTIONID,
SECTIONS.DCID AS SECTIONSDCID,
SECTIONS.EXCLUDEFROMHONORROLL AS SECTHR,
COURSES.COURSE_NUMBER AS COURSE_NUMBER,
COURSES.COURSE_NAME AS COURSE_NAME,
COURSES.EXCLUDEFROMHONORROLL AS CRSHR
FROM SECTIONS
INNER JOIN COURSES ON SECTIONS.COURSE_NUMBER = COURSES.COURSE_NUMBER AND SECTIONS.SCHOOLID = COURSES.SCHOOLID
WHERE SECTIONS.TERMID IN (2700,2701)
AND SECTIONS.SCHOOLID = 406
AND SECTIONS.EXCLUDEFROMHONORROLL = 0
AND COURSES.EXCLUDEFROMHONORROLL = 0
), --end validCrsSect
validStandard (SELECT STANDARDID,
IDENTIFIER,
TRANSIENTCOURSELIST
FROM STANDARD
WHERE isActive = 1
AND YEARID = 27
AND ( instr (STANDARD.identifier, 'MHS.TS', 1 ,1) > 0 --Is a valid standard for this criteria: MHS TS
or STANDARD.identifier = 'MHTC.TS.2' --or MHTC TS
or STANDARD.identifier = 'MHTC.TS.4' )
), --end validStandard
--sgsWithChecks (
SELECT sgs.STANDARDGRADESECTIONID AS SGSID,
sgs.STUDENTSDCID as STUDENTSDCID,
sgs.STANDARDID AS STANDARDID,
sgs.STORECODE AS STORECODE,
sgs.SECTIONSDCID AS SECTIONSDCID,
sgs.YEARID AS YEARID,
sgs.STANDARDGRADE AS STANDARDGRADE,
(select count(CCID) from validCC INNER JOIN STANDARDGRADESECTION sgs ON sgs.STUDENTSDCID = validCC.STUDENTSDCID and sgs.SECTIONSDCID = validCC.SECTIONID) as CC_OK,
(select count(SECTIONID) from validCrsSection INNER JOIN STANDARDGRADESECTION sgs ON sgs.SECTIONSDCID = validCrsSect.SECTIONSDCID) AS CRS_OK,
(select count(STANDARDID) from validStandard INNER JOIN STANDARDGRADESECTION sgs ON sgs.STANDARDID = validStandard.STANDARDID) AS STD_OK
FROM STANDARDGRADESECTION sgs
The purpose of putting the 'OK' columns in the vGrades table is because the final SELECT (not included) goes through and counts up the instances of certain scores filtering by the checks.
Frustratingly, there are two IDs in both the students table and the sections table (and it's not the same data). So when I go to link everything, some tables use ID as the FK, others use DCID as the FK; and I have to pull in an extra table to make that conversion. Makes the joins more fun that way I guess.
Each individual query works on its own, but I can't get the final select count() to work to pull their data. I tried embedding the initial queries as subqueries, but I couldn't pass the studentid into them, and it would run that query for each student, instead of once at the beginning.
My current error is:
Error starting at line : 13 in command -
SECTIONS.DCID AS SECTIONSDCID,
Error report -
Unknown Command
However before it was saying unknown table and referencing the last line of the join statement. All the table names are valid.
Thoughts?
I replaced the INNER JOIN with a simple WHERE condition. This seems to work.
(SELECT COUNT (CCID) FROM validCC WHERE sgs.STUDENTSDCID = validCC.STUDENTSDCID and sgs.SECTIONSDCID = validCC.SECTIONID) as CC_OK,
(SELECT COUNT (SECTIONID) FROM validCrsSect WHERE sgs.SECTIONSDCID = validCrsSect.SECTIONSDCID) AS CRS_OK,
(SELECT COUNT (STANDARDID) FROM validStandard WHERE sgs.STANDARDID = validStandard.STANDARDID) AS STD_OK
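Why the WHERE version behaves differently can be checked on a toy dataset. The sketch below uses Python's built-in sqlite3 module rather than Oracle, and the two tiny tables and their rows are made up for illustration. It assumes the original intent was a per-row count: when the subquery re-joins STANDARDGRADESECTION under the same alias, the inner alias hides the outer one, so the subquery is no longer tied to the current row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE standardgradesection (sgsid INTEGER, standardid INTEGER);
    CREATE TABLE validstandard (standardid INTEGER);
    INSERT INTO standardgradesection VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO validstandard VALUES (10), (20);
""")

# Re-joining the table inside the subquery: the inner alias "sgs" hides the
# outer one, so the subquery is uncorrelated and every row gets the same total.
rejoined = conn.execute("""
    SELECT sgs.sgsid,
           (SELECT COUNT(*)
              FROM validstandard v
              JOIN standardgradesection sgs
                ON sgs.standardid = v.standardid) AS std_ok
      FROM standardgradesection sgs
     ORDER BY sgs.sgsid
""").fetchall()

# Correlating with a plain WHERE: the count is keyed to the current outer row.
correlated = conn.execute("""
    SELECT sgs.sgsid,
           (SELECT COUNT(*)
              FROM validstandard v
             WHERE v.standardid = sgs.standardid) AS std_ok
      FROM standardgradesection sgs
     ORDER BY sgs.sgsid
""").fetchall()

print(rejoined)    # [(1, 2), (2, 2), (3, 2)]
print(correlated)  # [(1, 1), (2, 1), (3, 0)]
```

With the correlated form, a row whose standard is not in the valid list gets 0, which is what a later filter on the OK columns would need.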
I removed the stray comma at the end of validStandard and replaced "from validCrsSection" with "from validCrsSect" (assuming it was meant to refer to that WITH clause and there isn't another validCrsSection table). I am also guessing that the counts are meant to be keyed to the current sgs row and not counts of the whole table. That gives me this:
with validcc as
( select cc.id as ccid
, cc.studentid
, cc.sectionid
, cc.termid
, st.dcid as studentsdcid
from cc
join students st on st.id = cc.studentid
where cc.termid in (2700, 2701)
and cc.schoolid = 406
)
, validcrssect as
( select s.id as sectionid
, s.dcid as sectionsdcid
, s.excludefromhonorroll as secthr
, c.course_number
, c.course_name
, c.excludefromhonorroll as crshr
from sections s
join courses c
on c.course_number = s.course_number
and c.schoolid = s.schoolid
where s.termid in (2700, 2701)
and s.schoolid = 406
and s.excludefromhonorroll = 0
and c.excludefromhonorroll = 0
)
, validstandard as
( select standardid
, identifier
, transientcourselist
from standard
where isactive = 1
and yearid = 27
and ( instr(standard.identifier, 'MHS.TS', 1, 1) > 0
or standard.identifier in ('MHTC.TS.2','MHTC.TS.4') )
)
select sgs.standardgradesectionid as sgsid
, sgs.studentsdcid
, sgs.standardid
, sgs.storecode
, sgs.sectionsdcid
, sgs.yearid
, sgs.standardgrade
, ( select count(*) from validcc
where validcc.studentsdcid = sgs.studentsdcid
and validcc.sectionid = sgs.sectionsdcid ) as cc_ok
, ( select count(*) from validcrssect
where validcrssect.sectionsdcid = sgs.sectionsdcid ) as crs_ok
, ( select count(*) from validstandard
where validstandard.standardid = sgs.standardid ) as std_ok
from standardgradesection sgs;
This works with the six table definitions reverse-engineered as:
create table students
( id integer not null
, dcid integer );
create table cc
( id integer
, studentid integer
, sectionid integer
, termid integer
, schoolid integer );
create table courses
( course_number integer
, course_name varchar2(30)
, excludefromhonorroll integer
, schoolid integer );
create table sections
( id integer not null
, dcid integer
, excludefromhonorroll integer
, termid integer
, schoolid integer
, course_number integer );
create table standard
( standardid integer
, identifier varchar2(20)
, transientcourselist varchar2(50)
, isactive integer
, yearid integer );
create table standardgradesection
( standardgradesectionid integer
, studentsdcid integer
, standardid integer
, storecode integer
, sectionsdcid integer
, yearid integer
, standardgrade integer );
I have an Oracle join query that retrieves data very slowly: about 7 minutes for 1000 rows. Could you please help me rewrite it so that the data is pulled faster? The next step is to take the selected values and load them into a MySQL table. I am using the Pentaho tool here. Thanks
select
null id,
ss.ILOAN_CODE ,
ss.INST_NUM ,
ss.INST_AMT ,
ss.INST_PRINCIPAL ,
ss.INST_INTEREST ,
ss.BALANCE_PRINCIPAL ,
ss.INST_DUE_DATE ,
ss.PAID_FLAG ,
ss.LATE_FEE ,
ss.PAYMENT_DATE ,
ss.INST_AMT_PAID ,
ss.INST_AMT_DUE ,
ss.REV_CHECK_NUM ,
ss.REV_CHECK_AMT ,
ss.CREATED_BY ,
ss.DATE_CREATED ,
ss.UPDATED_BY ,
ss.DATE_UPDATED ,
ss.INST_DAYS ,
ss.MATURED_INTEREST ,
ss.UNPAID_INTEREST ,
ss.ADJ_INST_PRINCIPAL ,
ss.ADJ_INST_AMT ,
ss.ADJ_INST_INTEREST ,
ss.ADJ_BALANCE_PRINCIPAL ,
ss.ADJ_MATURED_INTEREST ,
ss.ADJ_UNPAID_INTEREST ,
ss.IS_PRINTED ,
ss.RTN_FEE_AMT ,
ss.WAIVE_FEE_AMT ,
ss.LATE_FEE_AMT ,
ss.APR_BALANCE_PRINCIPAL ,
ss.ACHDEPOSIT_DATE ,
ss.ACHRETURN_DATE ,
ss.ACHCLEAR_DATE ,
ss.APR_INST_INTEREST ,
ss.APR_UNPAID_INTEREST ,
ss.CSO_FEE ,
ss.MATURED_CSO_FEE ,
ss.UNPAID_CSO_FEE ,
ss.CSO_FEE_BALANCE
from ST_IL_SCHEDULE ss,
ST_IL_MASTER sm,
BO_MASTER bm
where sm.iloan_code = ss.iloan_code
and sm.bo_code = bm.bo_code
and ss.ILOAN_CODE in (select distinct loan_Number from SVP_LOAN_MASTER_INVENTORY)
and ss.ILOAN_CODE in (select distinct loan_Number from SVP_LOAN_MASTER_INVENTORY)
This condition is a likely candidate for the slowness. You don't need DISTINCT here; also, please use explicit joins for readability.
Try:
Select
null id,
ss.ILOAN_CODE ,
ss.INST_NUM ,
ss.INST_AMT ,
ss.INST_PRINCIPAL ,
ss.INST_INTEREST ,
ss.BALANCE_PRINCIPAL ,
ss.INST_DUE_DATE ,
ss.PAID_FLAG ,
ss.LATE_FEE ,
ss.PAYMENT_DATE ,
ss.INST_AMT_PAID ,
ss.INST_AMT_DUE ,
ss.REV_CHECK_NUM ,
ss.REV_CHECK_AMT ,
ss.CREATED_BY ,
ss.DATE_CREATED ,
ss.UPDATED_BY ,
ss.DATE_UPDATED ,
ss.INST_DAYS ,
ss.MATURED_INTEREST ,
ss.UNPAID_INTEREST ,
ss.ADJ_INST_PRINCIPAL ,
ss.ADJ_INST_AMT ,
ss.ADJ_INST_INTEREST ,
ss.ADJ_BALANCE_PRINCIPAL ,
ss.ADJ_MATURED_INTEREST ,
ss.ADJ_UNPAID_INTEREST ,
ss.IS_PRINTED ,
ss.RTN_FEE_AMT ,
ss.WAIVE_FEE_AMT ,
ss.LATE_FEE_AMT ,
ss.APR_BALANCE_PRINCIPAL ,
ss.ACHDEPOSIT_DATE ,
ss.ACHRETURN_DATE ,
ss.ACHCLEAR_DATE ,
ss.APR_INST_INTEREST ,
ss.APR_UNPAID_INTEREST ,
ss.CSO_FEE ,
ss.MATURED_CSO_FEE ,
ss.UNPAID_CSO_FEE ,
ss.CSO_FEE_BALANCE
from ST_IL_SCHEDULE ss
inner join ST_IL_MASTER sm on (sm.iloan_code = ss.iloan_code)
inner join BO_MASTER bm on (sm.bo_code = bm.bo_code)
-- assumes loan_number is unique in SVP_LOAN_MASTER_INVENTORY; if it is not, keep the IN (...) filter so rows are not duplicated
inner join SVP_LOAN_MASTER_INVENTORY slm on (ss.iloan_code = slm.loan_number)
If that does not help, please consider creating indexes on the columns used in the joins.
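Two points in this answer can be checked on a small example: DISTINCT inside an IN subquery does not change the result, and replacing IN with a join only returns the same rows when the joined column is unique. The sketch below uses Python's sqlite3 module with made-up stand-in tables (schedule, inventory) and invented rows, so it illustrates the semantics only and says nothing about Oracle's timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedule (iloan_code INTEGER, inst_num INTEGER);
    CREATE TABLE inventory (loan_number INTEGER);
    INSERT INTO schedule VALUES (1, 1), (1, 2), (2, 1);
    -- loan 1 appears twice in the inventory on purpose
    INSERT INTO inventory VALUES (1), (1);
""")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

# IN is a membership test, so DISTINCT in the subquery is redundant.
in_distinct = count("""
    SELECT COUNT(*) FROM schedule s
     WHERE s.iloan_code IN (SELECT DISTINCT loan_number FROM inventory)
""")
in_plain = count("""
    SELECT COUNT(*) FROM schedule s
     WHERE s.iloan_code IN (SELECT loan_number FROM inventory)
""")

# A join multiplies schedule rows by the number of matching inventory rows.
joined = count("""
    SELECT COUNT(*) FROM schedule s
      JOIN inventory i ON i.loan_number = s.iloan_code
""")

print(in_distinct, in_plain, joined)  # 2 2 4
```

If the inventory table can hold the same loan number more than once, an EXISTS or IN filter keeps the original row count, while the join does not.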
I need to create a query in which, if a certain field is blank or null, I run a select against another table to retrieve the missing value. Could you please advise on a way to accomplish this? Below is the query. The field in question is BEAT.
SELECT COALESCE(ADDRESSES.BEAT,Incident_addresses.beat)
, COALESCE (ADDRESSES.SUB_BEAT,Incident_addresses.sub_beat)
, ADDRESSES.STREET_NAME
, ADDRESSES.STREET_NUMBER
, ADDRESSES.SUB_NUMBER
, WARRANT_PEOPLE_VW.LNAME
, WARRANT_PEOPLE_VW.FNAME
, WARRANT_PEOPLE_VW.DOB
, WARRANT_PEOPLE_VW.RACE_RACE_CODE
, WARRANT_PEOPLE_VW.SEX_SEX_CODE
, WARRANT_PEOPLE_VW.CASE_NUMBER
, E_WARRANTS.DATE_ISSUED
, E_WARRANTS.TELETYPE_NUMBER
, E_WARRANTS.ORDINANCE_VIOLATION
FROM EJSDBA.ADDRESSES
, POL_LEEAL.E_WARRANTS
, POL_LEEAL.WARRANT_PEOPLE_VW,incident_people,Incident_addresses
WHERE ADDRESSES.ADDRESS_ID =E_WARRANTS.ADDR_ADDRESS_ID
AND E_WARRANTS.WARRANT_ID = WARRANT_PEOPLE_VW.WARRANT_ID
AND WARRANT_PEOPLE_VW.NME_TYP_NAME_TYPE_CODE = 'P'
AND WARRANT_PEOPLE_VW.AGNCY_CD_AGENCY_CODE = 'MCPD'
AND WARRANT_PEOPLE_VW.WSC_CODE='A'
AND EJSDBA.ADDRESSES.ADDRESS_ID= Incident_addresses.ADDRESS_ID
and incident_people.inc_incident_id=Incident_addresses.incident_id
ORDER BY ADDRESSES.BEAT
, ADDRESSES.SUB_BEAT
, ADDRESSES.STREET_NAME
, ADDRESSES.STREET_NUMBER
;
You can embed a correlated subquery in a CASE expression, but you MUST correlate the table inside the subquery with values from the outer query so that the correct value can be located.
SELECT
COALESCE(ADDRESSES.BEAT, Incident_addresses.beat)
, CASE
WHEN Addresses.BEAT IS NULL THEN (
SELECT
Beat
FROM incidents inner_ref
WHERE ???outer??.incident_id = inner_ref.id
AND rownum = 1)
END AS x
, COALESCE(ADDRESSES.SUB_BEAT, Incident_addresses.sub_beat)
...
and that subquery MUST return only a single value (hence I have used "and rownum = 1").
(From Oracle 12c onward you could use FETCH FIRST 1 ROW ONLY instead.)
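The pattern can be tried end to end on a toy dataset. This is a minimal sketch using Python's sqlite3 module, not Oracle: the two tables and their rows are invented, the correlation column (address_id) is an assumption about how the real tables relate, and LIMIT 1 stands in for "rownum = 1".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (address_id INTEGER, beat TEXT);
    CREATE TABLE incident_addresses (address_id INTEGER, beat TEXT);
    INSERT INTO addresses VALUES (1, 'A1'), (2, NULL), (3, NULL);
    INSERT INTO incident_addresses VALUES (2, 'B7'), (2, 'B7'), (1, 'Z9');
""")

rows = conn.execute("""
    SELECT a.address_id,
           CASE
               WHEN a.beat IS NULL THEN (
                   SELECT ia.beat
                     FROM incident_addresses ia
                    WHERE ia.address_id = a.address_id  -- correlation to the outer row
                    LIMIT 1)                            -- guarantees a single value
               ELSE a.beat
           END AS beat
      FROM addresses a
     ORDER BY a.address_id
""").fetchall()

print(rows)  # [(1, 'A1'), (2, 'B7'), (3, None)]
```

Address 2 has two matching incident rows, which is why the single-row limit is needed; address 3 has none, so the fallback itself comes back NULL.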
I have a table that has (for example) 4 columns.
pk_table_id INT NOT NULL
username VARCHAR(100) NOT NULL
start_date DATETIME NOT NULL
end_date DATETIME NULL
My requirement is to return all rows in descending order of end_date - BUT the NULL values must be first, and then descending order of start_date.
I've done it in SQL - but could someone assist me with a LINQ version to do this?
This is the SQL query we use:
SELECT [person_employment_id]
, [party_id]
, [employer_name]
, [occupation]
, [telephone]
, [start_date]
, [end_date]
, [person_employment_type_id]
, [person_employment_end_reason_type_id]
, [comments]
, [deleted]
, [create_user]
, [create_date]
, [last_update_user]
, [last_update_date]
, [version]
FROM [dbo].[person_employment]
WHERE ([party_id]=#party_id)
ORDER BY ISNull([end_date],'9999-DEC-31') DESC, [start_date] DESC
For this problem, you could do a null check on the end_date and use that result as the ordering. So you don't need to use the same SQL constructs to achieve this, but rather use one more natural in your language of choice (C# I'm assuming).
var query =
from row in dc.Table
let isEndDateNull = row.end_date == null
orderby isEndDateNull descending, row.end_date descending, row.start_date descending
select row;
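The same idea (sort on a null flag first, then on the dates) can be checked outside the database. Below is a small Python sketch with invented sample rows; it mirrors the SQL ORDER BY above, which puts NULL end dates first, then sorts by end_date descending and start_date descending.

```python
from datetime import date

# Hypothetical sample rows standing in for person_employment records.
rows = [
    {"id": 1, "start_date": date(2010, 1, 1), "end_date": date(2012, 6, 30)},
    {"id": 2, "start_date": date(2015, 3, 1), "end_date": None},
    {"id": 3, "start_date": date(2013, 1, 1), "end_date": date(2014, 12, 31)},
    {"id": 4, "start_date": date(2016, 9, 1), "end_date": None},
]

def sort_key(row):
    end = row["end_date"]
    # (null flag, end_date or a floor value, start_date).
    # With reverse=True, True sorts before False, so NULL end dates come first.
    return (end is None, end or date.min, row["start_date"])

ordered = sorted(rows, key=sort_key, reverse=True)
print([r["id"] for r in ordered])  # [4, 2, 3, 1]
```

The floor value only exists so that None is never compared with a date; it does not affect the order because the null flag is compared first.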
Oracle FAQ defines temp table space as follows:
Temporary tablespaces are used to manage space for database sort operations and for storing global temporary tables. For example, if you join two large tables, and Oracle cannot do the sort in memory, space will be allocated in a temporary tablespace for doing the sort operation.
That's great, but I need more detail about what exactly is using the space. Due to quirks of the application design most queries do some kind of sorting, so I need to narrow it down to client executable, target table, or SQL statement.
Essentially, I'm looking for clues to tell me more precisely what might be wrong with this (rather large application). Any sort of clue might be useful, so long as it is more precise than "sorting".
I'm not sure exactly what information you have to hand already, but using the following query will point out which program/user/sessions etc are currently using your temp space.
SELECT b.TABLESPACE
, b.segfile#
, b.segblk#
, ROUND ( ( ( b.blocks * p.VALUE ) / 1024 / 1024 ), 2 ) size_mb
, a.SID
, a.serial#
, a.username
, a.osuser
, a.program
, a.status
FROM v$session a
, v$sort_usage b
, v$process c
, v$parameter p
WHERE p.NAME = 'db_block_size'
AND a.saddr = b.session_addr
AND a.paddr = c.addr
ORDER BY b.TABLESPACE
, b.segfile#
, b.segblk#
, b.blocks;
Once you find out which session is doing the damage, then have a look at the SQL being executed, and you should be on the right path.
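The size_mb column in that query is simply the number of temp blocks multiplied by the database block size, converted from bytes to megabytes. A quick Python sketch of the same arithmetic, assuming an 8 KB block size (the real value is whatever db_block_size reports on your system):

```python
def temp_usage_mb(blocks: int, block_size_bytes: int) -> float:
    """Mirror ROUND(blocks * db_block_size / 1024 / 1024, 2) from the query."""
    return round(blocks * block_size_bytes / 1024 / 1024, 2)

# 8192 is an assumed block size used only for this example.
print(temp_usage_mb(12800, 8192))  # 100.0
print(temp_usage_mb(100, 8192))    # 0.78
```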
Thanks go to Michael OShea for his answer,
but if you have Oracle RAC with multiple instances, you will need this:
SELECT b.TABLESPACE
, b.segfile#
, b.segblk#
, ROUND ( ( ( b.blocks * p.VALUE ) / 1024 / 1024 ), 2 ) size_mb
, a.inst_ID
, a.SID
, a.serial#
, a.username
, a.osuser
, a.program
, a.status
FROM gv$session a
, gv$sort_usage b
, gv$process c
, gv$parameter p
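WHERE p.NAME = 'db_block_size'
AND a.saddr = b.session_addr
AND a.paddr = c.addr
-- gv$ views return rows for every instance, so also match on inst_id
AND a.inst_id = b.inst_id
AND a.inst_id = c.inst_id
AND a.inst_id = p.inst_id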
-- AND b.TABLESPACE='TEMP2'
ORDER BY a.inst_ID , b.TABLESPACE
, b.segfile#
, b.segblk#
, b.blocks;
And this is the script to generate the kill statements.
Please review which sessions you will be killing:
SELECT b.TABLESPACE, a.username , a.osuser , a.program , a.status ,
'ALTER SYSTEM KILL SESSION '''||a.SID||','||a.SERIAL#||',@'||a.inst_ID||''' IMMEDIATE;'
FROM gv$session a
, gv$sort_usage b
, gv$process c
, gv$parameter p
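WHERE p.NAME = 'db_block_size'
AND a.saddr = b.session_addr
AND a.paddr = c.addr
-- gv$ views return rows for every instance, so also match on inst_id
AND a.inst_id = b.inst_id
AND a.inst_id = c.inst_id
AND a.inst_id = p.inst_id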
-- AND b.TABLESPACE='TEMP'
ORDER BY a.inst_ID , b.TABLESPACE
, b.segfile#
, b.segblk#
, b.blocks;
One rule of thumb is that almost any query that takes more than a second probably uses some TEMP space, and these are not just the ones involving ORDER BYs but also:
GROUP BYs (SORT GROUP BY before 10.2 and HASH GROUP BY from 10.2 onwards)
HASH JOINs or MERGE JOINs
Global Temp Tables (obviously)
Index rebuilds
Occasionally, used space in temp tablespaces doesn't get released by Oracle (bug/quirk) so you need to manually drop a file from the tablespace, drop it from the file system and create another one.