CTX index in oracle is quite slow? - oracle

I am using CTX index in oracle column .
I have two table
xyz have 6 million record.
abc has 100k record
What i want to achieve is this.
Scenario - I have to union them and then i need the data in order by of date . And then i need pagination record on them.
select page_loc, blurb_id, article_id, hit_date,val_rank from (
SELECT REGEXP_REPLACE(regexp_replace(page_loc, '^.*mercer\.com'),'\?.*$') AS page_loc,
blurb_id, article_id, hit_date,RANK() OVER (ORDER BY hit_date DESC) AS val_rank
from (
SELECT page_loc, blurb_id, article_id, hit_date
FROM abc u INNER JOIN person p ON u.person_id=p.person_id
WHERE (CONTAINS(u.page_loc,v_pattern) > 0 )
UNION ALL
SELECT e page_loc, blurb_id, article_id, hit_date
FROM xyz u INNER JOIN person p ON u.person_id=p.person_id
WHERE ( CONTAINS(u.page_loc,v_pattern) > 0)
) p
left join
company c on c.company_id = p.company_id
) a
where val_rank between (v_page_no -1) * 10 and v_page_no * 10
v_pattern --> searching text
v_page_no --> page count
It take long time when i have record for some v_pattern more e.g 100k,
for example a word have 2 million record. it take more than 20 min.
Any idea what can i do . or it is not possible with oracle.
NOTE :- when i analysis it i found order by and pagination making it slow . How can i overcome that

Related

How to Delete these Records in Netezza(Aginity)

I Written a Query to Identify the Duplicate records.
As Below.
WITH DUPS AS
(SELECT A_SURVEYID,
CAST(e_responsedate AS DATE) AS e_responsedate,
E_LG_VM_SURVEY_TYPE_ENUM
FROM TRANSIENT..INTERIM_NPS_SURVEY_MOBILE_RESULTS_20170909 a
GROUP BY A_SURVEYID,
CAST(e_responsedate AS DATE),
E_LG_VM_SURVEY_TYPE_ENUM
HAVING COUNT(*) > 1
),
RANKED AS
(SELECT R.DRS_RECORD_ID,
R.A_SURVEYID,
R.e_responsedate ,
ROW_NUMBER() OVER ( PARTITION BY R.A_SURVEYID, CAST(R.e_responsedate AS DATE),
R.E_LG_VM_SURVEY_TYPE_ENUM ORDER BY SUBSTR(R.DRS_RECORD_ID, INSTR(':', R.DRS_RECORD_ID, 37) + 1, 14) DESC,
SUBSTR(R.DRS_RECORD_ID, INSTR(':', R.DRS_RECORD_ID, 32) + 1, 4) ASC ) AS DR
FROM TRANSIENT..INTERIM_NPS_SURVEY_MOBILE_RESULTS_20170909 R
INNER JOIN DUPS
ON R.A_SURVEYID = DUPS.A_SURVEYID
AND CAST(R.e_responsedate AS DATE) = DUPS.e_responsedate
AND R.E_LG_VM_SURVEY_TYPE_ENUM = DUPS.E_LG_VM_SURVEY_TYPE_ENUM
)
SELECT *
FROM TRANSIENT..INTERIM_NPS_SURVEY_MOBILE_RESULTS_20170909 F
INNER JOIN RANKED
ON F.DRS_RECORD_ID = RANKED.DRS_RECORD_ID
WHERE RANKED.DR > 1
--
By Using the Above Query am able to retrieve the records.(some 6000+ ).
But am unable to delete those records.
Can you please help me on this.
Regards,
Krish
You are very close. In the last 5 lines, do this instead:
Delete from FROM TRANSIENT..INTERIM_NPS_SURVEY_MOBILE_RESULTS_2017090
Where DRS_RECORD_ID in (
Select DRS_RECORD_ID from RANKED WHERE DR>1)
Should work :)
BTW: I'm pretty sure this can be done with less code, but that is not important...

Oracle - What is the order of statement when using functions for filter in where clause

Refer Table WORKQUEUELOG with columns (ID,QueueTypeId,CreateDateTime,WorkId,InQ,OutQ)
There is an index available for (QueueTypeId,CreateDateTime,Id)
When trying to read the first X number of records with following query it is going for a full table search.
OPEN a_CursorHandle FOR
SELECT * FROM (
SELECT * FROM WORKQUEUELOG tbl
WHERE EX_CLS_WQLOG.FILTER(
tbl.QueueTypeId,
tbl.CreateDateTime,
NULL) = 1
AND EX_CLS_WQLOG.VALIDATION(
tbl.ID ,
tbl.WorkId ,
tbl.InQ ) = 1
ORDER BY tbl.QueueTypeId ASC,tbl.CreateDateTime ASC,tbl.Id ASC)
WHERE ROWNUM <= a_MaxCount;
We changed the query as following and started using the index for reading.
ie Validaiton function first and filter function as second.
OPEN a_CursorHandle FOR
SELECT * FROM (
SELECT * FROM WORKQUEUELOG tbl
WHERE EX_CLS_WQLOG.VALIDATION(
tbl.ID ,
tbl.WorkId,
tbl.InQ ) = 1
AND EX_CLS_WQLOG.FILTER(
tbl.QueueTypeId,
tbl.CreateDateTime,
NULL) = 1
ORDER BY tbl.QueueTypeId ASC,tbl.CreateDateTime ASC,tbl.Id ASC)
WHERE ROWNUM <= a_MaxCount;
What is the order of processing the SQL statement in oracle .? left to right or right to left.
From the above example this seems to be right to left . is this a consistent behavior. Is there any other efficient way to write this query.?

In Elasticsearch, how can I establish join query with conditions and later perform percentile and count functions?

I have set of tables in my data base like table A which has set of set of categories , table B set of repositeries. A and B are related by categoryid. And then table C which has set of properties for a repoId. Table C and A are associated with repoId.
Table C can have multiple values for a repoId.
The data in C table is like a property say a number string like 12345XXXX (max data of 10 characters) and I have to find the top 6 matching characters of a particular value in table C and the count of repoIds associated with those top 6 value for a particular data in table A (categoryid).
Table A(set of categories ) ---------> Table B (set of repositories, associated with A with categoryid)---------> Table V (set of FMProperties against a repoId)
Now currently, this has been achieved by using joins and substring queries on these tables and it is very slow.
I have to achieve this functionality using Elastic search. I dont have clear view how to start?
Do I create separate documents / indexes for table A , B and C or fetch the info using sql query and create a single document.
And how we can apply this analytics part explained above.
I am very new and amateur in this technology but I am following the tutorials provided at elasticsearch site.
PFB the query in mysql for this logic:-
select 'fms' as fmstype, C.fmscode as fmsCode,
count(C.repoId) as countOffms from tableC C, tableB B
where B.repoId = C.repoId and B.categoryid = 175
group by C.fmscode
order by countOffms desc
limit 1)
UNION ALL
(select 'fms6' as fmstype, t1.fmscode, t2.countOffms from tableC t1
inner join
(
select substring(C.fmscode,1,6) as first6,
count(C.repoId) as countOffms from tableC C, tableB B
where B.repoId = C.repoId and B.categoryid = 175 and length(C.fmscode) = 6
group by substring(C.fmscode,1,6) order by countOffms desc
limit 1 ) t2
ON
substring(t1.fmscode,1,6) = t2.first6 and length(t1.fmscode) = 6
group by t1.fmscode
order by count(t1.fmscode) desc
limit 1)

Counting the total number of rows depending on a column value

I have a query that should count the total number of rows returned depending on a column value. For example:
As you can see, the M field should display the total number of rows returned which should be 5 because the FT_LOT are all the same value. Here is the query that I have so far:
SELECT DISTINCT
VBATCH_ID, MAXIM_PN, BAGNUMBER, FT_LOT
, m
, level as n
FROM
(
SELECT
VBATCH_ID, MAXIM_PN, BAGNUMBER, FT_LOT, QTY, DC, PRINTDATE, WS_GREEN, WS_PNR, WS_PCN, MSL, BAKETIME, EXPTIME
, una
, dulo
, (dulo - una) + 1 AS m
FROM
(
SELECT c.containername VBATCH_ID
,pb.productname MAXIM_PN
,bn.wipdatavalue BAGNUMBER
,ln.wipdatavalue FT_LOT
,aw.wipdatavalue QTY
,DECODE(ln.wipdatavalue,la.attr_081,la.attr_083
,la.attr_085,la.attr_087
,la.attr_089,la.attr_091
,la.attr_093,la.attr_095
,la.attr_097,la.attr_099
,la.attr_101,la.attr_103
,la.attr_105,la.attr_107
,la.attr_109,la.attr_111
,la.attr_113,la.attr_116
,la.attr_117,la.attr_119
) DC
,TO_CHAR(SYSDATE,'MM/DD/YYYY HH:MI:SS PM') PRINTDATE
,DECODE(UPPER(la.attr_158),'GREEN','HF',NULL) WS_GREEN
,DECODE(la.attr_140,NULL,NULL,'PNR') WS_PNR
,DECODE(la.Attr_080,NULL,NULL,'PCN') WS_PCN
,p.attr_011 MSL
,P.attr_013 BAKETIME
,p.attr_014 EXPTIME
, CASE
WHEN INSTR(bn.wipdatavalue, '-') = 0 THEN
bn.wipdatavalue
ELSE
SUBSTR(bn.wipdatavalue, 1, INSTR(bn.wipdatavalue, '-')-1)
END AS una
, CASE
WHEN INSTR(bn.wipdatavalue, '-') = 0 THEN
bn.wipdatavalue
ELSE
SUBSTR(bn.wipdatavalue, INSTR(bn.wipdatavalue, '-') + 1)
END AS dulo
FROM Container C
JOIN a_lotattributes la ON c.lotattributesid = la.lotattributesid
JOIN product p ON c.productid=p.productid
JOIN productbase pb ON p.productbaseid=pb.productbaseid
JOIN a_adhocwipdatarecord a ON a.objectrefid=c.containerid
JOIN a_adhocwipdatarecorddetails bn ON a.adhocwipdatarecordid=bn.adhocwipdatarecordid AND bn.wipdatanamename ='TR_BAG_NUMBER'
LEFT JOIN a_adhocwipdatarecorddetails ln ON a.adhocwipdatarecordid=ln.adhocwipdatarecordid AND ln.wipdatanamename ='TR_FT_LOT NUMBER'
LEFT JOIN a_adhocwipdatarecorddetails aw ON A.adhocwipdatarecordid=aw.adhocwipdatarecordid AND aw.wipdatanamename ='TR_FT LOT QTY'
WHERE ln.wipdatavalue = :ftlot AND bn.wipdatavalue LIKE :wip
)
) WHERE level LIKE :n
CONNECT BY LEVEL <= m
ORDER BY BAGNUMBER
Thanks for helping out guys.
Actually GROUP BY is not the solution. Having looked again at your desired output I have realised that what you want is an analytic count.
Your posted query is a bit of a mess and, sorry ,but I'm not prepared to invest time in it. This is the sort of structure you need:
select vbatch_id, maxim_pn, bagnumber, ft_lot
, count(*) over (partition by ft_lot) m
from whatever ...
Find out more.
Not sure why you need the DISTINCT. DISTINCT almost always indicates a failure to get the WHERE clause right.

Query that uses Clustered Index Scan instead of seek

I have the following query that returns < 300 results. It is currently taking about 4 seconds to complete, and when I look at the execution plan, it shows that it is spending 41% of resources on a clustered index scan. My limited knowledge of database administration suggests that a clustered index seek would improve performance. How can I get the query to use a clustered index seek instead of a clustered index scan? Below is the pertinent information and the query.
Sql Server 2008 R2
Table PMDME approx 140,000 rows (this is the one that is taking up 41% of resources)
Server Hardware: 16 core 2.7gz processors, 48gb ram
DECLARE #start date, #end date
SET #start = '2013-01-01'
SET #end = CAST(GETDATE() AS DATE)
SELECT
b.total,
c.intakes,
d.ships,
a.CODE_,
RTRIM(a.NAME_) as name,
f.employee as Salesperson,
g.referral_type_id,
h.referral_type,
e.slscode,
a.city,
a.STATE_,
a.zip
FROM PACWARE.ADS.RFDME a
LEFT OUTER JOIN (
SELECT SUM(b.quantity) total, a.ref_id from event.dbo.sample a
JOIN event.dbo.sample_parts b on a.id = b.sample_id
JOIN PACWARE.ADS.PTDME c on b.part_id = c.CODE_
WHERE c.MEDICAREID = 'E0607' AND a.order_date between #start and #end
GROUP BY a.ref_id
)b on a.CODE_ = b.ref_id
LEFT OUTER JOIN (
SELECT COUNT(a.CODE_)as intakes, rfcode
FROM PACWARE.ADS.PMDME a
WHERE a.REGDATETIME BETWEEN #start and #end
GROUP BY a.RFCODE
) c on a.CODE_ = c.rfcode
LEFT OUTER JOIN (
SELECT
COUNT(a.CODE) as ships, b.rfcode
FROM
(
SELECT
A.ACCOUNT AS CODE,
MIN(CAST(A.BILLDATETIME AS DATE)) AS SHIPDATE
FROM PACWARE.ADS.ARODME A
LEFT OUTER JOIN PACWARE.ADS.PTDME B ON A.PTCODE=B.CODE_
LEFT OUTER JOIN event.dbo.newdate() D ON A.ACCOUNT=D.ACCOUNT
LEFT OUTER JOIN event.dbo.newdate_extras() D2 ON A.ACCOUNT=D2.ACCOUNT
WHERE A.BILLDATETIME>=#start
AND A.BILLDATETIME=#start AND D.NEWDATE=#start AND D2.NEWDATE'ID'
Group by
A.ACCOUNT,
B.MEDICAREID,
A.CATEGORY
) a
JOIN PACWARE.ADS.PMDME b on a.CODE = b.CODE_
GROUP BY b.RFCODE
) d on a.CODE_ = d.rfcode
LEFT OUTER JOIN event.dbo.employee_slscode e on a.SLSCODE = e.slscode
JOIN event.dbo.employee f on e.employee_id = f.id
JOIN event.dbo.referral_data g on a.CODE_ = g.CODE_
JOIN event.dbo.referral_type h on g.referral_type_id = h.id
WHERE total > 0
I would try creating first and index just for the colum REGDATETIME on PACWARE.ADS.PMDME table.
GO
CREATE NONCLUSTERED INDEX [IX_PMDME_REGDATETIME] ON PACWARE.ADS.PMDME
(
[REGDATETIME] ASC
)
GO
Test how it works. I would also test adding another index to the column RFCODE (same table) if the selectivity of the column is good enough.

Resources