Using Cumulative and Percent Rank functions in my query - ranking-functions

I'm working on this query and would like to use the cumulative distribution (CUME_DIST) or PERCENT_RANK function to get the cumulative distribution and the percent rank for the field 'LeadDistribution'. However, when the decile value is populated it doesn't give me the right number for that decile. For example, the LeadGrouping record 'FLJeffersonCollateral' should have a decile of 1 but is showing a decile of 2. What am I doing wrong? The query is below.
Select c.LeadGrouping
, c.StateID
, c.County
, c.Media_Type
, Sum(c.Leads) as 'Leads'
, Sum(isnull(c.CCPEnroll,0)) as 'CCPEnroll'
, c.CloseRate
--, row_number() over(partition by c.Field1 order by c.CloseRate desc) as line_number
, sum(mx.MaxLeads) as 'MaxLeads' --ALTER TO MAX LEADS OVERALL --sum(mx.MaxLeads)
, CONVERT(DECIMAL(10,10), isnull(Sum(c.leads),0) / convert(numeric, isnull(sum(mx.MaxLeads),0))) as 'LeadDistribution' --CONFIRM TOTAL SUMS TO 100%
, PERCENT_RANK () over (order by Convert (Decimal(10,10), (isnull(Sum(c.leads),0) / convert(numeric, isnull(sum(mx.MaxLeads),0))))) as 'percent rank'
From (Select 'A' as 'Field1'
, b.LeadGrouping
, b.StateID
, b.County
, b.Media_Type
, Sum(b.Leads) as 'Leads'
, Sum(b.CCPEnroll) as 'CCPEnroll'
, CONVERT(DECIMAL(5,2), isnull(Sum(b.CCPEnroll),0) / convert(numeric, isnull(Sum(b.Leads),0))) as 'CloseRate'
From (Select a.LeadGrouping
, a.StateID
, a.County
, a.Media_Type
, Sum(a.Lead) as 'Leads'
, Sum(a.CCPEnroll) as 'CCPEnroll'
From (Select distinct al.LeadID
, al.Interaction_ID
, al.createddate
, al.appointmentdate
, al.C2CExp
, Case When al.EnrollMonth is not null Then DateDiff(day, al.createddate, al.AppDate)
Else DateDiff(day, al.createddate, GETDATE()) End as 'LeadTenureInDays'
, Count(distinct i.interaction_id) as 'LeadInteractionCount'
, l.lead_interactionsource__c
, al.MarketID
, left(al.state_county_key,2) as 'StateID'
, al.County
, al.Media_Type
, al.SegmentName
, al.Correctcampaign
, c.name as 'CampaignName'
, c.lead_program__c
, l.lead_disposition__c
, al.CCPEnroll
, al.PDPEnroll
, al.AppDate
, al.EnrollMonth
, al.writingagentid
, p.ProducerType
, al.Lead
, case When f.fipscode is null Then 'Not In CCP Footprint'
When f.FIPSCode is not null Then 'In CCP Footprint'
else 'ERROR' End as 'Footprint'
, CONCAT(left(al.state_county_key,2),al.county, al.Media_Type) as 'LeadGrouping'
From rknow.dbo.allleadstemp2 al
Left Join rknow.dbo.campaign_sfdc_datalake c on al.correctcampaign = c.id
Left Join rknow.dbo.lead_sfdc_datalake l on al.leadid = l.id
Left Join rknow.dbo.producerstructure_datalake p on al.writingagentid = p.producerid and al.createddate >= p.effectivestartdate and al.createddate <= DateAdd(day, 1, p.effectiveenddate)
Left Join [RKnow].[rpt].[interaction] i on al.leadid = i.lead_id
Inner Join rknow.dbo.fipsspc f on al.fips = f.fipscode --2022 CCP County Level Footprint
Where al.C2CExp >= DateAdd(day,1,getdate())
and al.lead_type = 'ccp' --HD 12/04/2020: Added criteria filter to exclude CrossSell leads
Group by al.LeadID
, al.Interaction_ID
, al.C2CExp
, al.createddate
, al.MarketID
, al.state_county_key
, al.County
, al.Media_Type
, al.SegmentName
, al.Correctcampaign
, al.appointmentdate
, c.name
, c.lead_program__c
, l.lead_disposition__c
, al.CCPEnroll
, al.PDPEnroll
, al.EnrollMonth
, al.writingagentid
, p.ProducerType
, l.lead_interactionsource__c
, al.AppDate
, f.fipscode
, al.Lead) a
Group by a.LeadGrouping
, a.Media_Type
, a.StateID
, a.County) b
Group by b.LeadGrouping
, b.StateID
, b.County
, b.Media_Type) c
Left Join (Select distinct 'A' as 'Field1', sum(mx.lead) as 'MaxLeads'
From(Select distinct al.LeadID
, al.Interaction_ID
, al.createddate
, al.appointmentdate
, al.C2CExp
, Case When al.EnrollMonth is not null Then DateDiff(day, al.createddate, al.AppDate)
Else DateDiff(day, al.createddate, GETDATE()) End as 'LeadTenureInDays'
, Count(distinct i.interaction_id) as 'LeadInteractionCount'
, l.lead_interactionsource__c
, al.MarketID
, left(al.state_county_key,2) as 'StateID'
, al.County
, al.Media_Type
, al.SegmentName
, al.Correctcampaign
, c.name as 'CampaignName'
, c.lead_program__c
, l.lead_disposition__c
, al.CCPEnroll
, al.PDPEnroll
, al.AppDate
, al.EnrollMonth
, al.writingagentid
, p.ProducerType
, al.Lead
, case When f.fipscode is null Then 'Not In CCP Footprint'
When f.FIPSCode is not null Then 'In CCP Footprint'
else 'ERROR' End as 'Footprint'
, CONCAT(left(al.state_county_key,2),al.county, al.Media_Type) as 'LeadGrouping'
From rknow.dbo.allleadstemp2 al
Left Join rknow.dbo.campaign_sfdc_datalake c on al.correctcampaign = c.id
Left Join rknow.dbo.lead_sfdc_datalake l on al.leadid = l.id
Left Join rknow.dbo.producerstructure_datalake p on al.writingagentid = p.producerid and al.createddate >= p.effectivestartdate and al.createddate <= DateAdd(day, 1, p.effectiveenddate)
Left Join [RKnow].[rpt].[interaction] i on al.leadid = i.lead_id
Inner Join rknow.dbo.fipsspc f on al.fips = f.fipscode --2022 CCP County Level Footprint
Where al.C2CExp >= DateAdd(day,1,getdate())
and al.lead_type = 'ccp' --HD 12/04/2020: Added criteria filter to exclude CrossSell leads
Group by al.LeadID
, al.Interaction_ID
, al.C2CExp
, al.createddate
, al.MarketID
, al.state_county_key
, al.County
, al.Media_Type
, al.SegmentName
, al.Correctcampaign
, al.appointmentdate
, c.name
, c.lead_program__c
, l.lead_disposition__c
, al.CCPEnroll
, al.PDPEnroll
, al.EnrollMonth
, al.writingagentid
, p.ProducerType
, l.lead_interactionsource__c
, al.AppDate
, f.fipscode
, al.Lead) mx) mx on mx.Field1 = c.Field1
Group by c.LeadGrouping
, c.StateID
, c.County
, c.Media_Type
, c.CloseRate
, c.Field1
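For reference, here is a small Python sketch of how the two functions are defined, since they do not return the same number for the same row. PERCENT_RANK is (rank - 1) / (rows - 1), while CUME_DIST is the count of rows with a value less than or equal to the current one, divided by the row count. The sample values are made up, and how the decile is derived from either result is not shown in the query, so treat this only as one possible place to look.

```python
from bisect import bisect_left, bisect_right

def percent_rank(values):
    """PERCENT_RANK per value: (rank - 1) / (n - 1); ties share the lowest rank."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 1:
        return [0.0]
    # bisect_left gives the number of rows strictly below v, which is rank - 1.
    return [bisect_left(ordered, v) / (n - 1) for v in values]

def cume_dist(values):
    """CUME_DIST per value: rows with a value <= the current one, divided by n."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]

# Made-up LeadDistribution values for five groupings (illustration only).
lead_distribution = [0.05, 0.10, 0.10, 0.25, 0.50]
pr = percent_rank(lead_distribution)
cd = cume_dist(lead_distribution)
for value, p, c in zip(lead_distribution, pr, cd):
    print(value, p, c)
```

Note that the lowest row gets a percent rank of 0 but a cumulative distribution of 0.2 here, so a decile computed from one can land in a different bucket than a decile computed from the other.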

Related

Return only one row of a query selection

I am writing a data export where I need to return one row from a selection where there may be multiple rows. In this case, the second table is the telephone_current table. This table includes a row for several telephone types (CA, MA, PR, etc.), and they are not in any particular order. If the individual has a CA, I need to include that record; if not, then I would use either type MA or PR.
The query below works, technically, but it runs excruciatingly slowly (10 minutes or more).
I need advice on fixing this query so that it returns one row (record) per individual. The slowdown occurs when I include the self join on telephone_current tc. Note: I've also tried moving the AND condition into the WHERE clause, and it runs with the same delay.
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType
FROM isu_general_person igp
JOIN person_detail pd ON igp.person_uid = pd.person_uid
JOIN telephone_current tc ON igp.person_uid = tc.entity_uid
AND tc.phone_number = (
SELECT p.phone_number
FROM telephone_current p
WHERE tc.entity_uid = p.entity_uid
ORDER BY phone_type
FETCH FIRST 1 ROW ONLY
)
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL
;
We did identify a problem with the index on the telephone_current table. Once that was resolved, both versions provided by xQbert returned a single row for each individual. The version using WITH BaseData ran in approximately 12 seconds; the version below, however, returned all rows in 2.4 seconds.
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType
FROM isu_general_person igp
JOIN person_detail pd
ON igp.person_uid = pd.person_uid
CROSS APPLY (SELECT xtc.phone_number_combined
FROM telephone xtc
WHERE igp.person_uid = xtc.entity_uid
ORDER BY case when phone_type = 'CA' then 1
when phone_Type in ('MA','PR') then 2
else 3 end,
phone_Type,
phone_number_combined
FETCH FIRST 1 ROW ONLY) tc
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL
Example using the row_number() analytic function and a common table expression. This limits the result to one phone per person: it partitions the phone numbers by Entity_Uid, orders each partition by a CASE expression (then by phone type, then by phone number), and assigns a row number in that order. The row number is then used to keep only the first phone number.
WITH BaseData as (
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType,
row_number() over (PARTITION BY Entity_Uid ORDER BY case when phone_type ='CA' then 1
when phone_Type in ('MA','PR') then 2
else 3 end, phone_Type, Phone_number) RN
FROM isu_general_person igp
JOIN person_detail pd ON igp.person_uid = pd.person_uid
JOIN telephone_current tc ON igp.person_uid = tc.entity_uid
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL)
SELECT *
FROM BaseData
WHERE RN = 1
;
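The same idea can be sketched outside the database. This is a minimal Python illustration of the row_number() logic, assuming phone rows are simple (entity_uid, phone_type, phone_number) tuples; the sample data and function names are hypothetical and not part of the original schema.

```python
from itertools import groupby

def phone_priority(phone_type):
    # Mirrors the CASE expression: CA first, then MA/PR, then anything else.
    if phone_type == "CA":
        return 1
    if phone_type in ("MA", "PR"):
        return 2
    return 3

def first_phone_per_person(rows):
    """rows: (entity_uid, phone_type, phone_number) tuples; keeps the RN = 1 row per person."""
    # Sorting by entity first plays the role of PARTITION BY; the rest is the ORDER BY.
    ordered = sorted(rows, key=lambda r: (r[0], phone_priority(r[1]), r[1], r[2]))
    result = []
    for _, group in groupby(ordered, key=lambda r: r[0]):
        for rn, row in enumerate(group, start=1):
            if rn == 1:  # same filter as WHERE RN = 1
                result.append(row)
    return result

# Made-up sample rows.
phones = [
    (1, "PR", "555-0101"),
    (1, "CA", "555-0100"),
    (2, "MA", "555-0200"),
    (2, "PR", "555-0201"),
    (3, "XX", "555-0300"),
]
picked = first_phone_per_person(phones)
print(picked)
```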
Example using CROSS APPLY: CROSS APPLY avoids the need for the analytic function. In effect it says: for each row where igp.person_uid = xtc.entity_uid, get the first record based on the order defined in the subquery, and stop once you have that first record for each user.
SELECT distinct igp.isu_id PersonnelNumber
, igp.preferred_first_name FirstName
, igp.current_last_name LastName
, NULL Title
, igp.current_mi MiddleInitial
, pd.email_preferred_address
, tc.phone_number_combined
, igp.isu_username networkID
, '0' GroupID
, e.home_organization_desc GroupName
, CASE
WHEN substr(e.employee_class,1,1) in ( 'N', 'C') THEN 'staff'
WHEN substr(e.employee_class,1,1) = 'F' THEN 'faculty'
ELSE 'other'
END GroupType
FROM isu_general_person igp
JOIN person_detail pd
ON igp.person_uid = pd.person_uid
CROSS APPLY (SELECT xtc.phone_number_combined
FROM telephone_current xtc
WHERE igp.person_uid = xtc.entity_uid
ORDER BY case when phone_type = 'CA' then 1
when phone_Type in ('MA','PR') then 2
else 3 end,
phone_Type,
phone_number_combined
FETCH FIRST 1 ROW ONLY) tc
LEFT JOIN employee e ON igp.person_uid = e.person_uid
-- LEFT JOIN faculty f ON igp.person_uid = f.person_uid
WHERE 1=1
AND e.employee_status = 'A'
AND substr(e.employee_class,1,1) in ( 'N', 'C', 'F')
AND igp.isu_username IS NOT NULL
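As a rough Python analogy for the CROSS APPLY version: for each person, look only at that person's phones and take the smallest one under the ordering, with no numbering of the other rows. The data layout and names below are assumptions for illustration. A person with no phone rows produces no output row, which matches how CROSS APPLY drops outer rows when the applied subquery returns nothing.

```python
def sort_key(phone):
    """Ordering used by the subquery: CASE priority, then phone type, then number."""
    phone_type, number = phone
    priority = 1 if phone_type == "CA" else 2 if phone_type in ("MA", "PR") else 3
    return (priority, phone_type, number)

def cross_apply_first_phone(people, phones_by_person):
    """people: (person_uid, name) tuples; phones_by_person: uid -> list of (type, number)."""
    result = []
    for person_uid, name in people:
        candidates = phones_by_person.get(person_uid, [])
        if not candidates:
            continue  # like CROSS APPLY: no matching phone row, no output row
        best = min(candidates, key=sort_key)  # FETCH FIRST 1 ROW ONLY
        result.append((person_uid, name, best[1]))
    return result

# Made-up sample data.
people = [(1, "Ada"), (2, "Grace"), (3, "Edsger")]
phones_by_person = {
    1: [("PR", "555-0101"), ("CA", "555-0100")],
    2: [("PR", "555-0201"), ("MA", "555-0200")],
}
rows = cross_apply_first_phone(people, phones_by_person)
print(rows)
```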

ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

I tried executing the query below, but I get the error "ORA-01652: unable to extend temp segment by 128 in tablespace TEMP" in Prod, whereas it executes successfully in lower environments.
Other than increasing the TEMP tablespace, can someone please suggest an alternative?
Thank you for your help.
INSERT /*+ APPEND */ INTO PFE_GP.CONT_DATA(SC_ID,
ID,
PRD,
CONT,
QTY,
PRICE,
PRICE2,
PRICE3,
TOTAL_SALES,
TOTAL_DISCOUNT)
SELECT A.*,
SUM (SALES) OVER (PARTITION BY CONT) AS TOTAL_SALES,
SUM (DISCOUNT) OVER (PARTITION BY CONT) AS TOTAL_DISCOUNT
FROM (
SELECT /*+ FULL(T) PARALLEL(T 8)*/ D.SC_ID,
T.ID,
T.PRD,
R1.CONT,
T.QTY,
T.PRICE,
B.PRICE2,
B.PRICE3,
T.PRICE*T.QTY AS SALES,
T.DISC DISCOUNT
FROM TC T
, BNDL_DFN X
, SOURCE_DATES D
, XREF R1
, PRICE B,
WC_PR W
WHERE D.SOURCE_TABLE = 'CBK'
AND UPPER (X.LEVEL) = 'CONTRACT'
AND X.OFFSET >= 0
AND D.AS_OF_DATE BETWEEN T.EFFECTIVE_DATE AND T.EXPIRATION_DATE
AND TRUNC (T.INV_DATE) BETWEEN X.EFF_DATE AND X.EXP_DATE
AND TRUNC (T.INV_DATE) BETWEEN R1.EFFECTIVE_DATE AND R1.EXPIRATION_DATE
AND T.CON = X.CONT
AND T.PRD = X.PRD
AND T.PRD = W.PRD
AND TRUNC (T.INV_DATE) BETWEEN W.EFFECTIVE_START_DATE and W.EFFECTIVE_END_DATE
AND UPPER(R1.PURP) = 'OTHER'
AND (T.CONT = R1.CONT OR T.PR_GROUP = R1.CONT)
AND T.CONT = B.CONT
AND T.PRD = B.PRD
AND TRUNC(T.INV_DATE) BETWEEN B.DT_START AND B.DT_END
UNION
SELECT /*+ FULL(T) PARALLEL(T 8)*/ D.SC_ID,
T.ID,
T.PRD,
R1.CONT,
T.QTY,
T.PRICE,
B.PRICE2,
B.PRICE3,
T.PRICE*T.QTY AS SALES,
0 DISCOUNT
FROM TC T
, BNDL_DFN X
, SOURCE_DATES D
, XREF R1
, PRICE B,
WC_PR W
WHERE D.SOURCE_TABLE = 'CBK'
AND UPPER (X.LEVEL) = 'CONTRACT'
AND X.OFFSET >= 0
AND D.AS_OF_DATE BETWEEN T.EFFECTIVE_DATE AND T.EXPIRATION_DATE
AND TRUNC (T.INV_DATE) BETWEEN X.EFF_DATE AND X.EXP_DATE
AND TRUNC (T.INV_DATE) BETWEEN R1.EFFECTIVE_DATE AND R1.EXPIRATION_DATE
AND T.PR_GROUP = X.CONT
AND T.PRD = X.PRD
AND T.PRD = W.PRD
AND TRUNC (T.INV_DATE) BETWEEN W.EFFECTIVE_START_DATE and W.EFFECTIVE_END_DATE
AND UPPER(R1.PURP) = 'OTHER'
AND (T.CONT = R1.XREF OR T.PR_GROUP = R1.XREF)
AND T.CONT = B.CONT
AND T.PRD = B.PRD
AND TRUNC(T.INV_DATE) BETWEEN B.DT_START AND B.DT_END
AND T.CUST = TO_CHAR (X.TRAD_CUST)
AND (T.PRICE_GROUP = R1.XREF OR T.CONTRACT = R1.XREF)
) a;
COMMIT;
Use a temporary table to execute both halves of your union separately.
INSERT /*+ APPEND */
INTO tmp_cont_data
( sc_id
, id
, prd
, cont
, qty
, price
, price2
, price3
, total_sales
, total_discount )
SELECT /*+ FULL(T) PARALLEL(T 8)*/
d.sc_id
, t.id
, t.prd
, r1.cont
, t.qty
, t.price
, b.price2
, b.price3
, t.price * t.qty AS sales
, t.disc discount
FROM tc t
, bndl_dfn x
, source_dates d
, xref r1
, price b
, wc_pr w
WHERE d.source_table = 'CBK'
AND UPPER( x.LEVEL ) = 'CONTRACT'
AND x.offset >= 0
AND d.as_of_date BETWEEN t.effective_date AND t.expiration_date
AND TRUNC( t.inv_date ) BETWEEN x.eff_date AND x.exp_date
AND TRUNC( t.inv_date ) BETWEEN r1.effective_date AND r1.expiration_date
AND t.con = x.cont
AND t.prd = x.prd
AND t.prd = w.prd
AND TRUNC( t.inv_date ) BETWEEN w.effective_start_date AND w.effective_end_date
AND UPPER( r1.purp ) = 'OTHER'
AND ( t.cont = r1.cont
OR t.pr_group = r1.cont )
AND t.cont = b.cont
AND t.prd = b.prd
AND TRUNC( t.inv_date ) BETWEEN b.dt_start AND b.dt_end;
INSERT /*+ APPEND */
INTO tmp_cont_data
( sc_id
, id
, prd
, cont
, qty
, price
, price2
, price3
, total_sales
, total_discount )
SELECT /*+ FULL(T) PARALLEL(T 8)*/
d.sc_id
, t.id
, t.prd
, r1.cont
, t.qty
, t.price
, b.price2
, b.price3
, t.price * t.qty AS sales
, 0 discount
FROM tc t
, bndl_dfn x
, source_dates d
, xref r1
, price b
, wc_pr w
WHERE d.source_table = 'CBK'
AND UPPER( x.LEVEL ) = 'CONTRACT'
AND x.offset >= 0
AND d.as_of_date BETWEEN t.effective_date AND t.expiration_date
AND TRUNC( t.inv_date ) BETWEEN x.eff_date AND x.exp_date
AND TRUNC( t.inv_date ) BETWEEN r1.effective_date AND r1.expiration_date
AND t.pr_group = x.cont
AND t.prd = x.prd
AND t.prd = w.prd
AND TRUNC( t.inv_date ) BETWEEN w.effective_start_date AND w.effective_end_date
AND UPPER( r1.purp ) = 'OTHER'
AND ( t.cont = r1.xref
OR t.pr_group = r1.xref )
AND t.cont = b.cont
AND t.prd = b.prd
AND TRUNC( t.inv_date ) BETWEEN b.dt_start AND b.dt_end
AND t.cust = TO_CHAR( x.trad_cust )
AND ( t.price_group = r1.xref
OR t.contract = r1.xref );
INSERT /*+ APPEND */
INTO pfe_gp.cont_data
(
sc_id
, id
, prd
, cont
, qty
, price
, price2
, price3
, total_sales
, total_discount
)
SELECT a.*
, SUM( sales ) OVER (PARTITION BY cont) AS total_sales
, SUM( discount ) OVER (PARTITION BY cont) AS total_discount
FROM tmp_cont_data a;
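It takes 3 statements instead of one, but should improve performance. Note that UNION also removes duplicate rows, which two separate inserts do not, so deduplicate in the final step if that matters for your data.
For details on temporary tables, see: How do you create a temporary table in an Oracle database?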
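To make the staging pattern concrete, here is a small self-contained sketch using Python's built-in sqlite3 module rather than Oracle. The table names (src, staging, final), the via_group flag and the sample rows are all invented for illustration. It also swaps the SUM() OVER (PARTITION BY ...) for a join against a grouped subquery so that it runs on older SQLite builds; the totals it produces are the same per-contract sums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (cont TEXT, sales REAL, discount REAL, via_group INTEGER);
    CREATE TABLE staging (cont TEXT, sales REAL, discount REAL);
    CREATE TABLE final (cont TEXT, sales REAL, discount REAL,
                        total_sales REAL, total_discount REAL);
    INSERT INTO src VALUES ('C1', 100, 5, 0), ('C1', 50, 2, 1), ('C2', 70, 1, 0);
""")

# Statement 1: first half of the former UNION, keeping the real discount.
conn.execute(
    "INSERT INTO staging SELECT cont, sales, discount FROM src WHERE via_group = 0"
)
# Statement 2: second half, where the discount is forced to 0.
conn.execute(
    "INSERT INTO staging SELECT cont, sales, 0 FROM src WHERE via_group = 1"
)
# Statement 3: attach per-contract totals computed over the staged rows.
conn.execute("""
    INSERT INTO final
    SELECT s.cont, s.sales, s.discount, t.total_sales, t.total_discount
    FROM staging s
    JOIN (SELECT cont,
                 SUM(sales) AS total_sales,
                 SUM(discount) AS total_discount
          FROM staging
          GROUP BY cont) t ON t.cont = s.cont
""")
conn.commit()

result = conn.execute("SELECT * FROM final ORDER BY cont, sales").fetchall()
print(result)
```

Each statement only has to sort or hash its own slice of the data, which is the reason splitting the work can reduce the peak temporary space needed at any one time.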

Re-writing a view that uses condition join [OR]

I have this view that is performing really badly, and the problem is that it's built on joins that use OR conditions.
I have tried rewriting it for the past 2 days, using CTEs and UNION, but the result is incorrect.
Maybe you can help with just a few ideas on how to do it, because I am completely lost.
CREATE VIEW [dbo].[V_Amazon_Listings]
AS
SELECT
DISTINCT
ISNULL(a.MerchantId, b.MerchantId) AS MerchantId
, ISNULL(a.SKU, b.[seller-sku]) AS SKU
, ISNULL(a.Quantity, 0) AS NONFBAQTY
, ISNULL(b.[Quantity Available], 0) AS FBAQTY
, ISNULL(a.[ASIN], b.[ASIN]) AS [ASIN]
, CASE
WHEN salePrices.SKU IS NOT NULL THEN ISNULL(CONVERT(DECIMAL(18, 2), (salePrices.SalePrice * vamm.USDConversionRate)), 9999)
ELSE ISNULL(CONVERT(DECIMAL(18, 2), (a.Price * vamm.USDConversionRate)), 9999)
END AS Price
, CASE ISNULL(a.Quantity, 0)
WHEN 0 THEN ISNULL(b.[Quantity Available], 0)
ELSE ISNULL(a.Quantity, 0)
END AS Quantity
--a.ReportRequestId,
, CASE WHEN ISNULL(a.Quantity, 0) = 0 AND ISNULL(b.[Quantity Available], 0) > 0 THEN 1
ELSE 0
END AS IsFBAed
, CASE WHEN c.[seller-sku] IS NULL THEN 0
ELSE 1
END AS IsRestricted--,
--a.ReportRequestId RegReportRequestID, b.ReportRequestId FBAReportRequestID, c.ReportRequestID CancReportRequestID
FROM
(
SELECT
ID
, MerchantId
, SKU
, [ASIN]
, Price
, CASE Quantity
WHEN '' THEN 0
ELSE Quantity
END AS Quantity
, ReportRequestId
FROM dbo.Amazon_Listings_Raw WITH (NOLOCK)
WHERE RIGHT(SKU,4) <> '__ON'
AND ReportRequestId IN
(
SELECT
MAX(ReportRequestId)
FROM v_Amazon_Listings_Raw WITH (NOLOCK)
GROUP BY MerchantId
)
) AS a
FULL JOIN
(
SELECT
b.ID
, b.MerchantId
, b.[seller-sku]
, b.[fulfillment-channel-sku]
, b.[ASIN]
, b.[condition-type]
, b.[Warehouse-Condition-code]
, b.[Quantity Available]
, b.ReportRequestId
FROM dbo.[Amazon_FBA_Listings_Raw] AS b WITH (NOLOCK)
WHERE
ReportRequestId IN
(
SELECT
MAX(ReportRequestId)
FROM [v_Amazon_FBA_Listings_Raw] WITH (NOLOCK)
GROUP BY MerchantId
)
AND [Warehouse-Condition-code] = 'Sellable'
) AS b
ON a.MerchantId = b.MerchantId
AND a.SKU = b.[seller-sku]
LEFT JOIN
(
SELECT
ID
, MerchantId
, [item-name]
, [item-description]
, [seller-sku]
, Price
, Quantity
, [image-url]
, [item-is-marketplace]
, [product-id-type]
, [zshop-shipping-fee]
, [item-note]
, [item-condition]
, [zshop-category1]
, [zshop-browse-path]
, [zshop-storefront-feature]
, asin1
, asin2
, asin3
, [will-ship-internationally]
, [expedited-shipping]
, [zshop-boldface]
, [product-id]
, ReportRequestId
FROM dbo.[Amazon_Cancelled_Listings_Raw] WITH (NOLOCK)
WHERE RIGHT([seller-sku],4) <> '__ON'
AND ReportRequestId IN
(
SELECT MAX(ReportRequestId)
FROM [v_Amazon_Cancelled_Listings_Raw] WITH (NOLOCK)
GROUP BY MerchantId
)
) AS c
ON (a.MerchantId = c.MerchantId OR b.MerchantId = c.MerchantId)
AND (b.[seller-sku] = c.[seller-sku] OR a.SKU = c.[seller-sku])
INNER JOIN V_Amazon_Marketplace_Merchants vamm
ON (a.MerchantId = vamm.MerchantID OR b.MerchantId = vamm.MerchantId)
LEFT JOIN AmazonOurPrices salePrices WITH (NOLOCK)
ON vamm.MerchantId = salePrices.MerchantId
AND a.SKU = salePrices.SKU
AND salePrices.SalePrice <> salePrices.RegularPrice
This is what I have come up with. It runs really fast (about 20 seconds),
but the data is off, and I'm not sure what I am missing or doing wrong.
Any help would be gladly appreciated.
CREATE VIEW AMZ_NOTUSED_12_22_2017
as
WITH A ( MerchantID,Sku,Asin,Price,Quantity, [Seller-Sku],SalePrice,USDConversionRate,salePricesSKU,[Quantity Available])
AS
(
SELECT a.MerchantId,a.SKU,a.ASIN,a.Price,a.Quantity, c.[seller-sku],salePrices.SalePrice,vamm.USDConversionRate,salePrices.SKU AS salePricesSKU,'' [Quantity Available]
FROM
(
SELECT
ID
, MerchantId
, SKU
, [ASIN]
, Price
, CASE Quantity
WHEN '' THEN 0
ELSE Quantity
END AS Quantity
, ReportRequestId
FROM dbo.Amazon_Listings_Raw WITH (NOLOCK)
WHERE RIGHT(SKU,4) <> '__ON'
AND ReportRequestId IN
(
SELECT
MAX(ReportRequestId)
FROM v_Amazon_Listings_Raw WITH (NOLOCK)
GROUP BY MerchantId
)
) AS a
LEFT JOIN
(
SELECT
ID
, MerchantId
, [item-name]
, [item-description]
, [seller-sku]
, Price
, Quantity
, [image-url]
, [item-is-marketplace]
, [product-id-type]
, [zshop-shipping-fee]
, [item-note]
, [item-condition]
, [zshop-category1]
, [zshop-browse-path]
, [zshop-storefront-feature]
, asin1
, asin2
, asin3
, [will-ship-internationally]
, [expedited-shipping]
, [zshop-boldface]
, [product-id]
, ReportRequestId
FROM dbo.[Amazon_Cancelled_Listings_Raw] WITH (NOLOCK)
WHERE RIGHT([seller-sku],4) <> '__ON'
AND ReportRequestId IN
(
SELECT MAX(ReportRequestId)
FROM [v_Amazon_Cancelled_Listings_Raw] WITH (NOLOCK)
GROUP BY MerchantId
)
) AS c
ON a.MerchantId = c.MerchantId and a.SKU = c.[seller-sku]
INNER JOIN V_Amazon_Marketplace_Merchants vamm
ON a.MerchantId = vamm.MerchantID
LEFT JOIN AmazonOurPrices salePrices WITH (NOLOCK)
ON vamm.MerchantId = salePrices.MerchantId
AND a.SKU = salePrices.SKU
AND salePrices.SalePrice <> salePrices.RegularPrice
)
,
B AS
(
SELECT b.MerchantId,
b.[Seller-SKU] AS Sku,
b.ASIN,
c.Price,
c.quantity ,
c.[seller-sku],
salePrices.SalePrice,
vamm.USDConversionRate,
salePrices.SKU AS salePricesSKU,
b.[Quantity Available]
FROM
(
SELECT
b.ID
, b.MerchantId
, b.[seller-sku]
, b.[fulfillment-channel-sku]
, b.[ASIN]
, b.[condition-type]
, b.[Warehouse-Condition-code]
, b.[Quantity Available]
, b.ReportRequestId
FROM dbo.[Amazon_FBA_Listings_Raw] AS b WITH (NOLOCK)
WHERE
ReportRequestId IN
(
SELECT
MAX(ReportRequestId)
FROM [v_Amazon_FBA_Listings_Raw] WITH (NOLOCK)
GROUP BY MerchantId
)
AND [Warehouse-Condition-code] = 'Sellable'
) AS b
LEFT JOIN
(
SELECT
ID
, MerchantId
, [item-name]
, [item-description]
, [seller-sku]
, Price
, Quantity
, [image-url]
, [item-is-marketplace]
, [product-id-type]
, [zshop-shipping-fee]
, [item-note]
, [item-condition]
, [zshop-category1]
, [zshop-browse-path]
, [zshop-storefront-feature]
, asin1
, asin2
, asin3
, [will-ship-internationally]
, [expedited-shipping]
, [zshop-boldface]
, [product-id]
, ReportRequestId
FROM dbo.[Amazon_Cancelled_Listings_Raw] WITH (NOLOCK)
WHERE RIGHT([seller-sku],4) <> '__ON'
AND ReportRequestId IN
(
SELECT MAX(ReportRequestId)
FROM [v_Amazon_Cancelled_Listings_Raw] WITH (NOLOCK)
GROUP BY MerchantId
)
) AS c
ON
b.[seller-sku] = c.[seller-sku] and b.MerchantId = c.MerchantId
INNER JOIN V_Amazon_Marketplace_Merchants vamm
ON b.MerchantId = vamm.MerchantId
LEFT JOIN AmazonOurPrices salePrices WITH(NOLOCK)
ON vamm.MerchantId = salePrices.MerchantId
AND b.[Seller-SKU] = salePrices.SKU
AND salePrices.SalePrice <> salePrices.RegularPrice
)
SELECT
DISTINCT
a.MerchantId AS MerchantId
, a.SKU
, a.Quantity AS NONFBAQTY
, ISNULL(a.[Quantity Available],0) AS FBAQTY
, a.[ASIN] AS [ASIN]
, CASE
WHEN salePricesSKU IS NOT NULL THEN ISNULL(CONVERT(DECIMAL(18, 2), (SalePrice * USDConversionRate)), 9999)
ELSE ISNULL(CONVERT(DECIMAL(18, 2), (a.Price * USDConversionRate)), 9999)
END AS Price
, CASE ISNULL(a.Quantity, 0)
WHEN 0 THEN ISNULL(a.[Quantity Available], 0)
ELSE ISNULL(a.Quantity, 0)
END AS Quantity
, CASE WHEN ISNULL(a.Quantity, 0) = 0 AND ISNULL(a.[Quantity Available], 0) > 0 THEN 1
ELSE 0
END AS IsFBAed
, CASE WHEN a.[seller-sku] IS NULL THEN 0
ELSE 1
END AS IsRestricted
FROM
(
SELECT * FROM a
UNION
SELECT * FRom B
) a

What is the meaning of "unable to open iterator for an alias" in pig?

I was trying to use the UNION operator as shown below:
uni_b = UNION A, B, C, D, E, F, G, H;
Here all the relations A, B, C, ... H have the same schema.
Whenever I use the DUMP operator, it runs fine up to 85%, and after that it shows the following error:
ERROR 1066: Unable to open iterator for alias uni_b
What does this mean? Where is the problem, and how should I debug it?
This is my Pig script:
ip = load '/jee/jee_data.txt' USING PigStorage(',') as (id:Biginteger, fname:chararray , lname:chararray , board:chararray , eid:chararray , gender:chararray , math:double , phy:double , chem:double , jeem:double , jeep:double , jeec:double ,cat:chararray , dob:chararray);
todate_ip = foreach ip generate id, fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec , cat , ToDate(dob,'dd/MM/yyyy') as dob;
jnbresult1 = foreach todate_ip generate id, fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, ROUND_TO(AVG(TOBAG( math , phy , chem )),3) as bresult, ROUND_TO(SUM(TOBAG(jeem , jeep , jeec )),3) as jresult , cat , dob;
rankjnbres = rank jnbresult1 by jresult DESC , bresult DESC , jeem DESC, math DESC, jeep DESC, phy DESC, jeec DESC, chem DESC, gender ASC, dob ASC, fname ASC, lname ASC DENSE;
rankjnbres1 = rank jnbresult1 by bresult DESC , jeem DESC, math DESC, jeep DESC, phy DESC, jeec DESC, chem DESC, gender ASC, dob ASC, fname ASC, lname ASC DENSE;
allper = foreach rankjnbres generate id, rank_jnbresult1 , fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, bresult, jresult , cat , dob , ROUND_TO(((double)((10000-rank_jnbresult1)/100.000)),3) as aper;
allper1 = foreach rankjnbres1 generate id, rank_jnbresult1 , fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, bresult, jresult , cat , dob , ROUND_TO(((double)((10000-rank_jnbresult1)/100.000)),3) as a1per;
SPLIT allper into cbseB if board=='CBSE', anbB if board=='Andhra Pradesh', apB if board=='Arunachal Pradesh', bhB if board=='Bihar', gjB if board=='Gujarat' , jnkB if board=='Jammu and Kashmir', mpB if board=='Madhya Pradesh', mhB if board=='Maharashtra', rjB if board=='Rajasthan' , ngB if board=='Nagaland' , tnB if board=='Tamil Nadu' , wbB if board=='West Bengal' , upB if board=='Uttar Pradesh';
rankcbseB = rank cbseB by jresult DESC , bresult DESC , jeem DESC, math DESC, jeep DESC, phy DESC, jeec DESC, chem DESC, gender ASC, dob ASC, fname ASC, lname ASC DENSE;
grp = group rankcbseB all;
maxno = foreach grp generate MAX(rankcbseB.rank_cbseB) as max1;
cbseper = foreach rankcbseB generate id, rank_cbseB , fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, bresult, jresult , cat , dob , ROUND_TO(((double)((maxno.max1-rank_cbseB)*100.000/maxno.max1)),3) as per , aper;
rankBcbseB = rank cbseB by bresult DESC , jeem DESC, math DESC, jeep DESC, phy DESC, jeec DESC, chem DESC, gender ASC, dob ASC, fname ASC, lname ASC DENSE;
grp = group rankBcbseB all;
maxno = foreach grp generate MAX(rankBcbseB.rank_cbseB) as max1;
Bcbseper = foreach rankBcbseB generate id, rank_cbseB , fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, bresult, jresult , cat , dob , ROUND_TO(((double)((maxno.max1-rank_cbseB)*100.000/maxno.max1)),3) as bper , aper;
rankanbB = rank anbB by jresult DESC , bresult DESC , jeem DESC, math DESC, jeep DESC, phy DESC, jeec DESC, chem DESC, gender ASC, dob ASC, fname ASC, lname ASC DENSE;
grp = group rankanbB all;
maxno = foreach grp generate MAX(rankanbB.rank_anbB) as max1;
anbper = foreach rankanbB generate id, rank_anbB , fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, bresult,jresult , cat , dob , ROUND_TO(((double)((maxno.max1-rank_anbB)*100.000/maxno.max1)),3) as per , aper;
rankBanbB = rank anbB by bresult DESC , jeem DESC, math DESC, jeep DESC, phy DESC, jeec DESC, chem DESC, gender ASC, dob ASC, fname ASC, lname ASC DENSE;
grp = group rankBanbB all;
maxno = foreach grp generate MAX(rankBanbB.rank_anbB) as max1;
Banbper = foreach rankanbB generate id, rank_anbB , fname , lname , board , eid , gender , math , phy , chem , jeem , jeep , jeec, bresult, jresult , cat , dob , ROUND_TO(((double)((maxno.max1-rank_anbB)*100.000/maxno.max1)),3) as bper , aper;
joinall = join cbseper by (per) , Bcbseper by (bper) ;
joinall = foreach joinall generate Bcbseper::id as id,cbseper::jresult as b1;
A = cross Bcbseper , allper;
A1 = foreach A generate Bcbseper::id as id,Bcbseper::rank_cbseB as rank,Bcbseper::fname as fname,Bcbseper::lname as lname,Bcbseper::board as board,Bcbseper::eid as eid ,Bcbseper::gender as gender, Bcbseper::bresult as bresult,Bcbseper::jresult as jresult,Bcbseper::cat as cat,Bcbseper::dob as dob,Bcbseper::bper as bper,Bcbseper::aper as aper,allper::jresult as b2,allper::aper as a1per;
B = filter A1 by bper > a1per;
C = group B by id;
Dcbse = foreach C {
E = order B by a1per DESC;
F = limit E 1;
generate FLATTEN(F.id) , FLATTEN(F.b2);
};
joincbse = join joinall by id , Dcbse by id;
joincbse = foreach joincbse generate joinall::id as id , joinall::b1 as b1, Dcbse::null::b2 as b2;
joinall = join anbper by (per) , Banbper by (bper) ;
joinall = foreach joinall generate Banbper::id as id,anbper::jresult as b1;
A = cross Banbper , allper;
A1 = foreach A generate Banbper::id as id,Banbper::rank_anbB as rank,Banbper::fname as fname,Banbper::lname as lname,Banbper::board as board,Banbper::eid as eid ,Banbper::gender as gender, Banbper::bresult as bresult,Banbper::jresult as jresult,Banbper::cat as cat,Banbper::dob as dob,Banbper::bper as bper,Banbper::aper as aper,allper::jresult as b2,allper::aper as a1per;
B = filter A1 by bper > a1per;
C = group B by id;
Danb = foreach C {
E = order B by a1per DESC;
F = limit E 1;
generate FLATTEN(F.id) , FLATTEN(F.b2);
};
joinanb = join joinall by id , Danb by id;
joinanb = foreach joinanb generate joinall::id as id , joinall::b1 as b1, Danb::null::b2 as b2;
uni_b = UNION joincbse , joinanb ;
I found a solution to this. What I did was the following.
First, I stored all the relations A, B, C, ... using the STORE operation as follows:
STORE A into '/opA/' using PigStorage(',');
Then I loaded the input for each relation using the LOAD operation as follows:
ipA = load '/opA/part-r-00000' USING PigStorage (',') as (id:Biginteger, b1: double, b2: double);
Finally, I did the union using the UNION operation as follows:
uni_b = UNION ipA ,ipB ,ipC , ipD ,ipE ;
I got the answer without any error.

Way to speed up an insert with left outer join

I am trying to figure out a way to speed up this query. As it stands, it takes about 40 minutes.
It inserts rows that do not yet exist in a table on a linked server.
INSERT INTO [remote.server.com].[DB].dbo.Table1( Id , Barcode , Name , Address , Address2 , City , State , Zip , Date , Text1 , Text2 , Text3 , Text4 , Text5 , Text6 , Text7 , Text8 , Text9 , Text10 )
SELECT s.Id , s.Barcode , s.Name , s.Address , s.Address2 , s.City , s.State , s.Zip , s.Date , s.Text1 , s.Text2 , s.Text3 , s.Text4 , s.Text5 , s.Text6 , s.Text7 , s.Text8 , s.Text9 , s.Text10
FROM LocalTable1 AS s LEFT OUTER JOIN [remote.server.com].[DB].dbo.Table1 AS d ON s.Id = d.Id AND s.Barcode = d.Barcode
WHERE d.Id IS NULL;
Any ideas? Thanks for the help.

Resources