N1ql count distinct associated documents - performance

I have a single bucket (Couchbase Community Edition 6.5) containing the following documents:
employee {
type: "Employee"
}
X {
type: "X",
employeeId: string,
date: string
}
Y {
type: "Y",
employeeId: string,
date: string
}
Z {
type: "Z",
employeeId: string,
date: string
}
I need to get the total number of documents (X,Y,Z) that are associated with each employee between two dates.
I have written the following query, which works but executes slowly:
CREATE INDEX `index_X` ON `bucket`(`type`,`date`, `employeeId`)
WHERE type = "X"
CREATE INDEX `index_Y` ON `bucket`(`type`,`date`, `employeeId`)
WHERE type = "Y"
CREATE INDEX `index_Z` ON `bucket`(`type`,`date`, `employeeId`)
WHERE type = "Z"
SELECT META(employee).id,
x.totalX,
y.totalY,
z.totalZ
FROM `bucket` employee
LEFT JOIN (
SELECT obj.employeeId,
COUNT(obj.employeeId) AS totalX
FROM `bucket` obj
WHERE obj.type = "X"
AND obj.date BETWEEN "startDate" AND "endDate"
GROUP BY obj.employeeId) x ON x.employeeId = META(employee).id
LEFT JOIN (
SELECT obj.employeeId,
COUNT(obj.employeeId) AS totalY
FROM `bucket` obj
WHERE obj.type = "Y"
AND obj.date BETWEEN "startDate" AND "endDate"
GROUP BY obj.employeeId) y ON y.employeeId = META(employee).id
LEFT JOIN (
SELECT obj.employeeId,
COUNT(obj.employeeId) AS totalZ
FROM `bucket` obj
WHERE obj.type = "Z"
AND obj.date BETWEEN "startDate" AND "endDate"
GROUP BY obj.employeeId) z ON z.employeeId = META(employee).id
WHERE employee.type = "Employee"
I also tried the following but this query times out completely!
CREATE INDEX `index_X` ON `bucket`(`type`,`date`, `employeeId`)
WHERE type = "X"
CREATE INDEX `index_Y` ON `bucket`(`type`,`date`, `employeeId`)
WHERE type = "Y"
CREATE INDEX `index_Z` ON `bucket`(`type`,`date`, `employeeId`)
WHERE type = "Z"
SELECT META(employee).id,
COUNT(x.employeeId),
COUNT(y.employeeId),
COUNT(z.employeeId)
FROM `bucket` employee
LEFT JOIN `bucket` x ON x.employeeId = META(employee).id
AND x.type = "X"
AND x.date BETWEEN "startDate" AND "endDate"
LEFT JOIN `bucket` y ON y.employeeId = META(employee).id
AND y.type = "Y"
AND y.date BETWEEN "startDate" AND "endDate"
LEFT JOIN `bucket` z ON z.employeeId = META(employee).id
AND z.type = "Z"
AND z.date BETWEEN "startDate" AND "endDate"
WHERE employee.type = "Employee"
GROUP BY META(employee).id
Can anyone please advise on a more optimal route to follow?
With both queries I can see that my indexes are being used but I can also see in my query plan that for every join a "NestedLoopJoin" is being chained to the previous one. Could this possibly be the problem?
I am still new to writing N1QL queries and am trying to figure out the most efficient methods, so any advice is welcome.

CREATE INDEX `index_1` ON `bucket`(`employeeId`, type, `date`) WHERE type IN ["X", "Y", "Z"];
SELECT META(employee).id,
SUM(CASE WHEN x.type = "X" THEN 1 ELSE 0 END) xcount,
SUM(CASE WHEN x.type = "Y" THEN 1 ELSE 0 END) ycount,
SUM(CASE WHEN x.type = "Z" THEN 1 ELSE 0 END) zcount
FROM `bucket` employee
LEFT JOIN `bucket` x
ON x.employeeId = META(employee).id AND x.type IN ["X", "Y", "Z"] AND x.date BETWEEN "startDate" AND "endDate"
WHERE employee.type = "Employee"
GROUP BY META(employee).id;
To avoid cascading nested-loop joins or a join explosion, use a CTE (available from 6.5) in this case:
CREATE INDEX `index_1` ON `bucket`(type, date, `employeeId`) WHERE type IN ["X", "Y", "Z"];
WITH etype AS (SELECT x.employeeId,
SUM(CASE WHEN x.type = "X" THEN 1 ELSE 0 END) xcount,
SUM(CASE WHEN x.type = "Y" THEN 1 ELSE 0 END) ycount,
SUM(CASE WHEN x.type = "Z" THEN 1 ELSE 0 END) zcount
FROM `bucket` x
WHERE x.type IN ["X", "Y", "Z"] AND x.date BETWEEN "startDate" AND "endDate"
GROUP BY x.employeeId)
SELECT META(employee).id,
SUM(y.xcount) xcount,
SUM(y.ycount) ycount,
SUM(y.zcount) zcount
FROM `bucket` AS employee
LEFT JOIN etype AS y ON y.employeeId = META(employee).id
WHERE employee.type = "Employee"
GROUP BY META(employee).id;
https://index-advisor.couchbase.com/indexadvisor/#1
https://blog.couchbase.com/create-right-index-get-right-performance/
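The effect of aggregating first and joining afterwards can be checked on a small relational stand-in. The sketch below is not N1QL and makes several assumptions: it uses Python's built-in sqlite3 module, a hypothetical `docs` table in place of the bucket, and made-up rows and dates. It only illustrates that the pre-aggregated CTE yields at most one row per employee, so the outer join never multiplies rows.

```python
import sqlite3

# Stand-in for the bucket: one table holding every document type.
# Table name, column names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT, type TEXT, employeeId TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?, ?)",
    [
        ("e1", "Employee", None, None),
        ("e2", "Employee", None, None),
        ("x1", "X", "e1", "2020-01-05"),
        ("x2", "X", "e1", "2020-01-09"),
        ("y1", "Y", "e1", "2020-01-07"),
        ("z1", "Z", "e1", "2020-03-01"),  # outside the date range used below
    ],
)

# Aggregate X/Y/Z once per employee, then join that small result to employees.
query = """
WITH etype AS (
    SELECT employeeId,
           SUM(CASE WHEN type = 'X' THEN 1 ELSE 0 END) AS xcount,
           SUM(CASE WHEN type = 'Y' THEN 1 ELSE 0 END) AS ycount,
           SUM(CASE WHEN type = 'Z' THEN 1 ELSE 0 END) AS zcount
    FROM docs
    WHERE type IN ('X', 'Y', 'Z') AND date BETWEEN ? AND ?
    GROUP BY employeeId
)
SELECT e.id, t.xcount, t.ycount, t.zcount
FROM docs AS e
LEFT JOIN etype AS t ON t.employeeId = e.id
WHERE e.type = 'Employee'
ORDER BY e.id
"""
rows = conn.execute(query, ("2020-01-01", "2020-01-31")).fetchall()
for row in rows:
    print(row)
```

In this stand-in, an employee with no matching documents comes back with NULL counts rather than 0, because the LEFT JOIN finds no row in the CTE.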

Related

Oracle Pivot options

I have the below Pivot and output. I would like to display the below.
Remove the parentheses around the columns?
Add indicator of X and Null in substitute of 1 and 0?
SQL:
SELECT DISTINCT
*
FROM (
SELECT D.ID, D.DI, A.ID
FROM A
LEFT JOIN AD ON A.ID = AD.ID
LEFT JOIN D ON AD.ID = D.ID
WHERE 1=1
AND A.ID = 890929
)
PIVOT
(
COUNT(ID)
FOR DI IN ( 'Low med','Soft','Regular','High Med','Other')
)
Query output:
ID 'Low med' 'Soft' 'Regular' 'High Med' 'Other'
1 1 1 0 0 1
Expected output:
ID LOW_MED SOFT REGULAR HIGH_MED OTHER
1 X X NULL NULL X

You can remove the single quotes (not parentheses, which are ()), by aliasing the pivoted expressions:
FOR DI IN ('Low med' as low_med, 'Soft' as soft, 'Regular' as regular,
'High Med' as high_med,'Other' as other)
You can then use those aliases for the second part, by adding case expressions to your main query:
SELECT id,
case when low_med = 1 then 'X' else null end as low_med,
case when soft = 1 then 'X' else null end as soft,
case when regular = 1 then 'X' else null end as regular,
case when high_med = 1 then 'X' else null end as high_med,
case when other = 1 then 'X' else null end as other
FROM (
SELECT D.ID, D.DI, A.ID
FROM A
LEFT JOIN AD ON A.ID = AD.ID
LEFT JOIN D ON AD.ID = D.ID
WHERE 1=1
AND A.ID = 890929
)
PIVOT
(
COUNT(ID)
FOR DI IN ('Low med' as low_med, 'Soft' as soft, 'Regular' as regular,
'High Med' as high_med,'Other' as other)
)
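For comparison, the same shape of result can be produced without PIVOT, using plain conditional aggregation. The sketch below is not the Oracle query above: it swaps PIVOT for CASE inside COUNT, runs on SQLite through Python's sqlite3 module, and uses a hypothetical `diets` table with made-up rows, so treat it as an illustration of the idea only.

```python
import sqlite3

# Hypothetical flattened input: one row per (id, di) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diets (id INTEGER, di TEXT)")
conn.executemany(
    "INSERT INTO diets VALUES (?, ?)",
    [(1, "Low med"), (1, "Soft"), (1, "Other")],
)

# The inner CASE yields 1 or NULL, so COUNT counts only matching rows;
# the outer CASE turns a positive count into 'X' and anything else into NULL.
row = conn.execute(
    """
    SELECT id,
           CASE WHEN COUNT(CASE WHEN di = 'Low med'  THEN 1 END) > 0 THEN 'X' END AS low_med,
           CASE WHEN COUNT(CASE WHEN di = 'Soft'     THEN 1 END) > 0 THEN 'X' END AS soft,
           CASE WHEN COUNT(CASE WHEN di = 'Regular'  THEN 1 END) > 0 THEN 'X' END AS regular,
           CASE WHEN COUNT(CASE WHEN di = 'High Med' THEN 1 END) > 0 THEN 'X' END AS high_med,
           CASE WHEN COUNT(CASE WHEN di = 'Other'    THEN 1 END) > 0 THEN 'X' END AS other
    FROM diets
    GROUP BY id
    """
).fetchone()
print(row)
```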

Get "group by" data other than the Key element

I'm trying to get the sum of fees per customer per month from the MySQL Sakila database.
My SQL query looks like this:
select first_name, last_name, MONTHNAME(payment_date) as Month, sum(amount) as FeeSum
from customer c
join payment p on c.customer_id = p.customer_id
where (payment_date between '2005-01-01' AND '2005-06-30')
group by c.customer_id, Month
order by Month desc, FeeSum desc;
I did this in LINQPad:
var q11 = from c in Customer
from p in Payment
where c.Customer_id == p.Customer_id && ((DateTime)p.Payment_date) > DateTime.Parse("2005-01-01") && ((DateTime)p.Payment_date) < DateTime.Parse("2005-06-30")
group new {c, p} by new {((DateTime)p.Payment_date).Month, p.Customer_id} into grp
select new {
Month = grp.Key.Month,
FeeSum = grp.Sum(s => s.p.Amount),
} into selection
orderby selection.Month, selection.FeeSum descending
select selection;
q11.Dump();
This works for the FeeSum and the Month, but I can't figure out how to get the first_name and the last_name of the customer.

Since the members of each grp all have the same Customer, just pick any one:
var q11 = from c in Customer
from p in Payment
where c.Customer_id == p.Customer_id && ((DateTime)p.Payment_date) > DateTime.Parse("2005-01-01") && ((DateTime)p.Payment_date) < DateTime.Parse("2005-06-30")
group new { c, p } by new { ((DateTime)p.Payment_date).Month, p.Customer_id } into grp
select new {
grp.First().c.First_name,
grp.First().c.Last_name,
Month = grp.Key.Month,
FeeSum = grp.Sum(s => s.p.Amount),
} into selection
orderby selection.Month, selection.FeeSum descending
select selection;
Note: I prefer to use let rather than select twice, but it is the same thing internally, I believe:
var q11 = from c in Customer
from p in Payment
where c.Customer_id == p.Customer_id && ((DateTime)p.Payment_date) > DateTime.Parse("2005-01-01") && ((DateTime)p.Payment_date) < DateTime.Parse("2005-06-30")
group new { c, p } by new { ((DateTime)p.Payment_date).Month, p.Customer_id } into grp
let FeeSum = grp.Sum(s => s.p.Amount)
orderby grp.Key.Month, FeeSum descending
select new {
grp.First().c.First_name,
grp.First().c.Last_name,
Month = grp.Key.Month,
FeeSum
};

Oracle/SQL Query issue

I am fairly new to Oracle/SQL. I am trying to pull different values from the same column and add them up alongside some other information. The other information is not the issue; the counting and adding is where the problem comes in.
I am connecting to an Oracle database. Here is what I have:
SELECT
EV.PUBLIC_DESCRIPTION,
EV.EVENT_DATE,
ES.PRICE,
BT.BUYER_TYPE_CODE,
PCA.ADDR1,
PCA.ADDR2,
PCA.CITY,
PCA.POSTAL_CODE,
PCE.EMAIL,
PC.FORMATTED_NAME,
PCP.PHONE_NUMBER,
PCP.SECONDARY,
SUM(COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRADLT' THEN 1 ELSE 0 END) + COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRADTE' THEN 1 ELSE 0 END) + COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRSTND' THEN 1 ELSE 0 END) + COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GSTDTE' THEN 1 ELSE 0 END) + COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GROUDI' THEN 1 ELSE 0 END)) AS "Adults",
SUM(COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRCHILD' THEN 1 ELSE 0 END) + COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRCHTE' THEN 1 ELSE 0 END)) AS 'Paid Child',
SUM(COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRPCH' THEN 1 ELSE 0 END)) AS 'Free Child',
SUM(COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRCOMP' THEN 1 ELSE 0 END)) AS 'Comps'
FROM EVENT EV
INNER JOIN EVENT_SEAT ES ON EV.EVENT_ID = ES.EVENT_ID
INNER JOIN BUYER_TYPE BT ON ES.BUYER_TYPE_ID = BT.BUYER_TYPE_ID
INNER JOIN PATRON_ORDER PO ON ES.ORDER_ID = PO.ORDER_ID
INNER JOIN PATRON_ACCOUNT PA ON ES.ATTENDING_PATRON_ACCOUNT_ID = PA.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT PC ON PA.PATRON_ACCOUNT_ID = PC.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT_ADDRESS PCA ON PC.PATRON_ACCOUNT_ID = PCA.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT_EMAIL PCE ON PCA.PATRON_ACCOUNT_ID = PCE.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT_PHONE PCP ON PCE.PATRON_ACCOUNT_ID = PCP.PATRON_ACCOUNT_ID
GROUP BY EV.PUBLIC_DESCRIPTION, EV.EVENT_DATE
ORDER BY ES.TRANSACTION_ID DESC, PCP.SECONDARY DESC, PCP.PHONE_NUMBER DESC, PC.FORMATTED_NAME DESC, PCE.EMAIL DESC, PCA.POSTAL_CODE DESC, PCA.CITY DESC, PCA.ADDR2 DESC, PCA.ADDR1 DESC, BT.BUYER_TYPE_CODE DESC, ES.PRICE DESC;
Any help would be greatly appreciated.

You need to decide which technique you want to use; currently you are using two techniques and they collide.
The key fact is that COUNT() increments by one for every NON-NULL value, so a row that falls through to ELSE 0 is still counted.
So, to use COUNT() with a case expression, do this:
COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRSTND' THEN 1 END)
or
COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRSTND' THEN 1 ELSE NULL END)
OR, don't use COUNT(), use SUM() instead
SUM(CASE WHEN BT.BUYER_TYPE_CODE = 'GRSTND' THEN 1 ELSE 0 END)
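The difference between these forms is easy to see on a few rows. The following is a minimal sketch, assuming a hypothetical `seats` table with made-up buyer type codes and using SQLite through Python's sqlite3 module rather than Oracle; the NULL handling of COUNT() and SUM() shown here is standard SQL behaviour.

```python
import sqlite3

# Hypothetical table with four rows, two of which are 'GRSTND'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (buyer_type_code TEXT)")
conn.executemany(
    "INSERT INTO seats VALUES (?)",
    [("GRSTND",), ("GRSTND",), ("GRCOMP",), ("GRPCH",)],
)

counts = conn.execute(
    """
    SELECT
        -- ELSE 0 is non-NULL, so every row is counted
        COUNT(CASE WHEN buyer_type_code = 'GRSTND' THEN 1 ELSE 0 END),
        -- no ELSE means NULL for non-matching rows, which COUNT skips
        COUNT(CASE WHEN buyer_type_code = 'GRSTND' THEN 1 END),
        -- SUM adds the 1s and 0s, so ELSE 0 is harmless here
        SUM(CASE WHEN buyer_type_code = 'GRSTND' THEN 1 ELSE 0 END)
    FROM seats
    """
).fetchone()
print(counts)  # (4, 2, 2)
```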
To add conditions together, I suggest you put them inside a single case expression.
Instead of something like this:
, COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRADLT' THEN 1 END)
+ COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRADTE' THEN 1 END)
+ COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRSTND' THEN 1 END)
+ COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GSTDTE' THEN 1 END)
+ COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GROUDI' THEN 1 END) AS "Adults"
Use this:
COUNT(CASE WHEN BT.BUYER_TYPE_CODE IN ('GRADLT','GRADTE','GRSTND','GSTDTE','GROUDI') THEN 1 ELSE NULL END)
There is also an issue with your GROUP BY, which MUST contain ALL non-aggregated columns from the select clause. I think your query should look more like this:
SELECT
EV.PUBLIC_DESCRIPTION
, EV.EVENT_DATE
, ES.PRICE
/* , BT.BUYER_TYPE_CODE */
, PCA.ADDR1
, PCA.ADDR2
, PCA.CITY
, PCA.POSTAL_CODE
, PCE.EMAIL
, PC.FORMATTED_NAME
, PCP.PHONE_NUMBER
, PCP.SECONDARY
, COUNT(CASE WHEN BT.BUYER_TYPE_CODE IN ('GRADLT', 'GRADTE', 'GRSTND', 'GSTDTE', 'GROUDI') THEN 1 ELSE NULL END) AS "Adults"
, COUNT(CASE WHEN BT.BUYER_TYPE_CODE IN ('GRCHILD', 'GRCHTE') THEN 1 ELSE NULL END) AS "Paid Child"
, COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRPCH' THEN 1 ELSE NULL END) AS "Free Child"
, COUNT(CASE WHEN BT.BUYER_TYPE_CODE = 'GRCOMP' THEN 1 ELSE NULL END) AS "Comps"
FROM EVENT EV
INNER JOIN EVENT_SEAT ES
ON EV.EVENT_ID = ES.EVENT_ID
INNER JOIN BUYER_TYPE BT
ON ES.BUYER_TYPE_ID = BT.BUYER_TYPE_ID
INNER JOIN PATRON_ORDER PO
ON ES.ORDER_ID = PO.ORDER_ID
INNER JOIN PATRON_ACCOUNT PA
ON ES.ATTENDING_PATRON_ACCOUNT_ID = PA.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT PC
ON PA.PATRON_ACCOUNT_ID = PC.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT_ADDRESS PCA
ON PC.PATRON_ACCOUNT_ID = PCA.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT_EMAIL PCE
ON PCA.PATRON_ACCOUNT_ID = PCE.PATRON_ACCOUNT_ID
INNER JOIN PATRON_CONTACT_PHONE PCP
ON PCE.PATRON_ACCOUNT_ID = PCP.PATRON_ACCOUNT_ID
GROUP BY
EV.PUBLIC_DESCRIPTION
, EV.EVENT_DATE
, ES.PRICE
/* , BT.BUYER_TYPE_CODE */
, PCA.ADDR1
, PCA.ADDR2
, PCA.CITY
, PCA.POSTAL_CODE
, PCE.EMAIL
, PC.FORMATTED_NAME
, PCP.PHONE_NUMBER
, PCP.SECONDARY
/* check all these columns exist in the select clause
ORDER BY
ES.TRANSACTION_ID DESC
, PCP.SECONDARY DESC
, PCP.PHONE_NUMBER DESC
, PC.FORMATTED_NAME DESC
, PCE.EMAIL DESC
, PCA.POSTAL_CODE DESC
, PCA.CITY DESC
, PCA.ADDR2 DESC
, PCA.ADDR1 DESC
, BT.BUYER_TYPE_CODE DESC
, ES.PRICE DESC
*/
When you come to the final clause, ORDER BY, a grouped query can ONLY reference grouped columns (normally the ones in the select clause) or aggregate expressions. This example would FAIL:
select column1 from table1 group by column1 order by fred
but this would work:
select column1 from table1 group by column1 order by column1

Your column aliases 'Paid Child', 'Free Child', 'Comps' should not be wrapped in single quotes. You should be using double quotes like you are already for "Adults".
So they should instead be:
"Paid Child"
"Free Child"
"Comps"
Or better yet, consider naming your aliases without any spaces, so you don't have to worry about wrapping the aliases in anything, like this:
paid_child
free_child
comps
Documentation on Database Object Names and Qualifiers:
Database Object Naming Rules
Every database object has a name. In a SQL statement, you represent the name of an object with a quoted identifier or a nonquoted identifier.
A quoted identifier begins and ends with double quotation marks ("). If you name a schema object using a quoted identifier, then you must use the double quotation marks whenever you refer to that object.
A nonquoted identifier is not surrounded by any punctuation.
...
Although column aliases, table aliases, usernames, and passwords are not objects or parts of objects, they must also follow these naming rules unless otherwise specified in the rules themselves.

Dynamic query to dynamic data

I am new to the Oracle database and am trying to execute the following query:
select o.id as ovaid ,
(case when(select count(m.cid) from ovamapper m where m.id = o.id and m.solutionid = 1)>0 then 1 else 0 end) as sol1,
(case when(select count(m.cid) from ovamapper m where m.id = o.id and m.solutionid = 2)>0 then 1 else 0 end) as sol2,
(case when(select count(m.cid) from ovamapper m where m.id = o.id and m.solutionid = 3)>0 then 1 else 0 end) as sol3
from ovatemplate o
order by o.id
Instead of static values for solutionid, I would like to select them from another table.
Any help on this is really appreciated.

You could use a join to the table that contains the solutionid, for example:
Select * from ovatemplate JOIN solutiontable ON (solutiontable.ovaid=ovatemplate.ovaid)
After that, replace the static values with solutionid.

Try this query
select o.id as ovaid ,
count(case when solutionid = 1 then m.cid else null end) as sol1 ,
count(case when solutionid = 2 then m.cid else null end) as sol2 ,
count(case when solutionid = 3 then m.cid else null end) as sol3
from ovamapper m , ovatemplate o
where m.id = o.id
group by o.id
order by o.id
If you don't need the aggregations as separate columns, you should probably do this instead:
select o.id as ovaid , solutionid , count(*) as sol
from ovamapper m , ovatemplate o
where m.id = o.id
and m.solutionid in (1,2,3)
group by o.id , solutionid
order by o.id
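To take the solution IDs from another table instead of the hard-coded list, the row-per-solution form can be joined to that table. The sketch below is only an illustration under stated assumptions: the lookup table `solution` and all rows are hypothetical, and it runs on SQLite through Python's sqlite3 module rather than Oracle. It pairs every template with every solution and counts the matching mapper rows, so combinations with no mapping still appear with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ovatemplate (id INTEGER);
    CREATE TABLE ovamapper (id INTEGER, cid INTEGER, solutionid INTEGER);
    CREATE TABLE solution (solutionid INTEGER);  -- hypothetical lookup table
    INSERT INTO ovatemplate VALUES (1), (2);
    INSERT INTO solution VALUES (1), (2), (3);
    INSERT INTO ovamapper VALUES (1, 10, 1), (1, 11, 1), (1, 12, 3), (2, 13, 2);
    """
)

# Every (template, solution) pair, with the number of mapper rows for it.
# COUNT(m.cid) skips the NULLs produced by unmatched LEFT JOIN rows.
rows = conn.execute(
    """
    SELECT o.id AS ovaid, s.solutionid, COUNT(m.cid) AS sol
    FROM ovatemplate o
    CROSS JOIN solution s
    LEFT JOIN ovamapper m
           ON m.id = o.id AND m.solutionid = s.solutionid
    GROUP BY o.id, s.solutionid
    ORDER BY o.id, s.solutionid
    """
).fetchall()
for row in rows:
    print(row)
```

Turning these rows back into one column per solution needs the column list to be known when the statement is written, which is why the fixed sol1, sol2, sol3 columns cannot simply be driven by the lookup table.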

LINQ with sub select query using sum and group by?

Can anyone help me with a LINQ query? I have converted most of it, but there is a subquery in the stored procedure and I can't figure out how to translate it.
This is the old stored procedure (truncated for brevity):
SELECT M.Period AS 'Period' ,
C.Code AS 'Group' ,
C.ClientCode AS 'Code' ,
C.ClientName AS 'Name' ,
( SELECT SUM(Amount) AS Expr1
FROM M
WHERE ( ClientCode = C.ClientCode )
GROUP BY ClientCode
) AS 'Amount' ,
As you can see from above the sub query is like so
SELECT SUM(Amount) AS Expr1
FROM M
WHERE ( ClientCode = C.ClientCode )
GROUP BY ClientCode
) AS 'Amount'
So I have done all my joins, and I have this so far, which works:
var test = from c in C join h in H on c.Code
equals h.Code join m in M on c.ClientCode
equals m.ClientCode
select new
{
Period=m.Period,
Group=c.Code,
Code= c.ClientCode,
Name= c.ClientName,
<-- Here is where i need the sub select query above -->
};
But I am at a loss as to how to do the subquery. The name of the column will be Amount, as you can see in the old stored procedure.
I would appreciate any feedback or help.
Thanks

I'm not sure about the last part of your SQL query, but I am assuming something like this:
SELECT M.Period AS 'Period' ,
C.Code AS 'Group' ,
C.ClientCode AS 'Code' ,
C.ClientName AS 'Name' ,
( SELECT SUM(Amount) AS Expr1
FROM M
WHERE ( ClientCode = C.ClientCode )
GROUP BY ClientCode
) AS 'Amount'
from C inner join M on C.ClientCode = M.ClientCode
so your LINQ will be this
var test = from c in db.C
select new {
Period = c.M.Period,
Group = c.Code,
Code = c.ClientCode,
Name = c.ClientName,
Amount = (System.Int32)
((from m0 in db.M
where
m0.ClientCode == c.ClientCode
group m0 by new {
m0.ClientCode
} into g
select new {
Expr1 = (System.Int32)g.Sum(p => p.Amount)
}).First().Expr1)
}
