How H2 chooses the right/wrong index in a join - performance

I had an issue with a named query in Java, but the actual problem turned out to be in H2.
I thought ANALYZE would solve my problem. It did locally on my dev machine, but on the client side it made things worse.
Scenario:
I have an H2 Database with data version 105. After importing some more data it becomes version 106.
The Table looks like
The Query (get the rows with the given guid and locale that have the highest version):
SELECT tdo.TECDOC_GUID as guid, tdo.TECDOC_LOCALE as locale , tdo.TECDOC_VERSION as version, tdo.DATA as data
FROM TECDOC_OBJECTS tdo
LEFT OUTER JOIN TECDOC_OBJECTS tdo1
ON (
tdo.TECDOC_GUID = tdo1.TECDOC_GUID AND
tdo.TECDOC_LOCALE = tdo1.TECDOC_LOCALE AND
tdo.TECDOC_VERSION < tdo1.TECDOC_VERSION)
WHERE tdo1.id IS NULL
AND tdo.TECDOC_GUID in ('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
AND tdo.TECDOC_LOCALE = 'de';
The execution plan before I ran the ANALYZE command (scanCount really low):
SELECT
TDO.TECDOC_GUID AS GUID,
TDO.TECDOC_LOCALE AS LOCALE,
TDO.TECDOC_VERSION AS VERSION,
TDO.DATA AS DATA
FROM PUBLIC.TECDOC_OBJECTS TDO
/* PUBLIC.IDX_TECDOC_GUID: TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0') */
/* WHERE (TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
AND (TDO.TECDOC_LOCALE = 'de')
*/
/* scanCount: 19 */
LEFT OUTER JOIN PUBLIC.TECDOC_OBJECTS TDO1
/* PUBLIC.IDX_GUID_LOCALE_VERSION: TECDOC_GUID = TDO.TECDOC_GUID
AND TECDOC_LOCALE = TDO.TECDOC_LOCALE
AND TECDOC_VERSION > TDO.TECDOC_VERSION
*/
ON (TDO.TECDOC_VERSION < TDO1.TECDOC_VERSION)
AND ((TDO.TECDOC_GUID = TDO1.TECDOC_GUID)
AND (TDO.TECDOC_LOCALE = TDO1.TECDOC_LOCALE))
/* scanCount: 4 */
WHERE (TDO.TECDOC_LOCALE = 'de')
AND ((TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
AND (TDO1.ID IS NULL))
/*
total: 37
TECDOC_OBJECTS.IDX_GUID_LOCALE_VERSION read: 6 (16%)
TECDOC_OBJECTS.IDX_TECDOC_GUID read: 8 (21%)
TECDOC_OBJECTS.TECDOC_OBJECTS_DATA read: 23 (62%)
*/
The execution plan after I ran the ANALYZE command (scanCount really high):
SELECT
TDO.TECDOC_GUID AS GUID,
TDO.TECDOC_LOCALE AS LOCALE,
TDO.TECDOC_VERSION AS VERSION,
TDO.DATA AS DATA
FROM PUBLIC.TECDOC_OBJECTS TDO
/* PUBLIC.IDX_GUID_LOCALE_VERSION: TECDOC_LOCALE = 'de'
AND TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
*/
/* WHERE (TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
AND (TDO.TECDOC_LOCALE = 'de')
*/
/* scanCount: 287385 */
LEFT OUTER JOIN PUBLIC.TECDOC_OBJECTS TDO1
/* PUBLIC.IDX_GUID_LOCALE_VERSION: TECDOC_GUID = TDO.TECDOC_GUID
AND TECDOC_LOCALE = TDO.TECDOC_LOCALE
AND TECDOC_VERSION > TDO.TECDOC_VERSION
*/
ON (TDO.TECDOC_VERSION < TDO1.TECDOC_VERSION)
AND ((TDO.TECDOC_GUID = TDO1.TECDOC_GUID)
AND (TDO.TECDOC_LOCALE = TDO1.TECDOC_LOCALE))
/* scanCount: 4 */
WHERE (TDO.TECDOC_LOCALE = 'de')
AND ((TDO.TECDOC_GUID IN('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'))
AND (TDO1.ID IS NULL))
/*
total: 11891
TECDOC_OBJECTS.IDX_GUID_LOCALE_VERSION read: 11884 (99%)
TECDOC_OBJECTS.TECDOC_OBJECTS_DATA read: 7 (0%)
*/
On my developer laptop, however, the query is still fast after ANALYZE. Somehow H2 picks the wrong index on the client (according to the documentation, it can use only one index per join).
Does anyone have any suggestions?

Your query is not complex. I think the key aspect of it is the WHERE condition:
WHERE tdo1.id IS NULL
AND tdo.TECDOC_GUID in ('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6',
'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
AND tdo.TECDOC_LOCALE = 'de';
For some reason H2 is using the index the wrong way. I would try to rephrase the condition, and see how H2's SQL optimizer works it out.
For example, you can try option #1:
SELECT
... -- columns, FROM, and OUTER JOIN here
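-- the parentheses matter: AND binds tighter than OR, so without them
-- "tdo1.id IS NULL" would only apply to the second GUID
WHERE (tdo.TECDOC_GUID = 'GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6'
AND tdo.TECDOC_LOCALE = 'de'
OR tdo.TECDOC_GUID = 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'
AND tdo.TECDOC_LOCALE = 'de')
AND tdo1.id IS NULL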
Or you can decouple the query in two to make sure it uses the index, as in option #2:
SELECT
... -- columns, FROM, and OUTER JOIN here
WHERE tdo.TECDOC_GUID = 'GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6'
AND tdo.TECDOC_LOCALE = 'de'
AND tdo1.id IS NULL
UNION ALL
SELECT
... -- columns, FROM, and OUTER JOIN here
WHERE tdo.TECDOC_GUID = 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0'
AND tdo.TECDOC_LOCALE = 'de'
AND tdo1.id IS NULL
This way, each search uses equality only, which is much simpler for the SQL optimizer to understand. Note the use of UNION ALL, which is cheaper than UNION because it does not have to remove duplicates.
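To check that such a rewrite does not change the result, here is a minimal sketch using Python's built-in sqlite3 instead of H2. The lower-case table, the shortened GUIDs and the sample rows are made up for illustration; only the shape of the query follows the question. It runs the "highest version per guid and locale" anti-join once with IN and once as two equality queries combined with UNION ALL, and compares the rows. It says nothing about which index H2 would pick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tecdoc_objects (
    id INTEGER PRIMARY KEY,
    tecdoc_guid TEXT, tecdoc_locale TEXT, tecdoc_version INTEGER, data TEXT
);
-- hypothetical sample data, not taken from the question
INSERT INTO tecdoc_objects (tecdoc_guid, tecdoc_locale, tecdoc_version, data) VALUES
    ('GUID-A', 'de', 105, 'a-de-105'),
    ('GUID-A', 'de', 106, 'a-de-106'),
    ('GUID-A', 'en', 106, 'a-en-106'),
    ('GUID-B', 'de', 105, 'b-de-105'),
    ('GUID-C', 'de', 106, 'c-de-106');
""")

# Anti-join: keep a row only if no row with the same guid/locale has a higher version.
BASE = """
SELECT tdo.tecdoc_guid, tdo.tecdoc_version, tdo.data
FROM tecdoc_objects tdo
LEFT OUTER JOIN tecdoc_objects tdo1
  ON tdo.tecdoc_guid = tdo1.tecdoc_guid
 AND tdo.tecdoc_locale = tdo1.tecdoc_locale
 AND tdo.tecdoc_version < tdo1.tecdoc_version
WHERE tdo1.id IS NULL
  AND tdo.tecdoc_locale = 'de'
"""

in_form = BASE + " AND tdo.tecdoc_guid IN ('GUID-A', 'GUID-B')"
union_form = (BASE + " AND tdo.tecdoc_guid = 'GUID-A'"
              + " UNION ALL "
              + BASE + " AND tdo.tecdoc_guid = 'GUID-B'")

in_rows = sorted(conn.execute(in_form).fetchall())
union_rows = sorted(conn.execute(union_form).fetchall())
print(in_rows)
print(in_rows == union_rows)
```

Both forms should return one row per requested GUID, the one with the highest version for locale 'de'.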

What solved the issue for me was using USE INDEX to specify which index H2 should use.
Here is the query that forces a certain index (an index hint, see http://www.h2database.com/html/performance.html#database_performance_tuning):
SELECT tdo.TECDOC_GUID as guid, tdo.TECDOC_LOCALE as locale , tdo.TECDOC_VERSION as version, tdo.DATA as data
FROM TECDOC_OBJECTS tdo USE INDEX (IDX_TECDOC_GUID)
LEFT OUTER JOIN TECDOC_OBJECTS tdo1
ON (
tdo.TECDOC_GUID = tdo1.TECDOC_GUID AND
tdo.TECDOC_LOCALE = tdo1.TECDOC_LOCALE AND
tdo.TECDOC_VERSION < tdo1.TECDOC_VERSION)
WHERE tdo1.id IS NULL
AND tdo.TECDOC_GUID in ('GUID-F2F77CE5-D8F5-4286-9A30-8FD500F735F6', 'GUID-41FD28DC-63C0-44D0-B8AE-0FCF7C78CEB0')
AND tdo.TECDOC_LOCALE = 'de';
This solved the issue. If you use it with Java and Hibernate, be aware that H2's parser does not understand USE INDEX in versions before 1.4.194. In my case, upgrading to 1.4.194 brought up some other issues, and I also deleted some combined indexes from my table.
Cheers


How to write ClickHouse SQL correctly?

This SQL can be executed on Oracle, but not on ClickHouse:
SELECT *
FROM PART, PARTSUPP
WHERE P_PARTKEY = PS_PARTKEY
AND PS_SUPPLYCOST = (
SELECT MIN(PS_SUPPLYCOST)
FROM PARTSUPP
WHERE P_PARTKEY = PS_PARTKEY
)
Exception:
Missing columns: 'P_PARTKEY' while processing query: 'SELECT min(PS_SUPPLYCOST)...
Any help will be appreciated.
Thank you.
The full correlated subquery SQL (Oracle):
SELECT
*
FROM
(
SELECT
S_ACCTBAL,
S_NAME,
N_NAME,
P_PARTKEY,
P_MFGR ,
S_ADDRESS,
S_PHONE,
S_COMMENT
FROM
PART,
SUPPLIER,
PARTSUPP,
NATION,
REGION
WHERE
P_PARTKEY = PS_PARTKEY
AND S_SUPPKEY = PS_SUPPKEY
AND P_SIZE = 25
AND P_TYPE LIKE '%COPPER'
AND S_NATIONKEY = N_NATIONKEY
AND N_REGIONKEY = R_REGIONKEY
AND R_NAME = 'ASIA'
AND PS_SUPPLYCOST = (
SELECT
MIN(PS_SUPPLYCOST)
FROM
PARTSUPP,
SUPPLIER,
NATION,
REGION
WHERE
P_PARTKEY = PS_PARTKEY
AND S_SUPPKEY = PS_SUPPKEY
AND S_NATIONKEY = N_NATIONKEY
AND N_REGIONKEY = R_REGIONKEY
AND R_NAME = 'ASIA' )
ORDER BY
S_ACCTBAL DESC,
N_NAME,
S_NAME,
P_PARTKEY )
WHERE
ROWNUM <= 100;
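Rewritten for ClickHouse, with the correlated subquery turned into a grouped derived table that is joined on P_PARTKEY: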
SELECT
*
from
(
SELECT
s.S_ACCTBAL AS S_ACCTBAL,
s.S_NAME AS S_NAME,
n.N_NAME AS N_NAME,
p.P_PARTKEY AS P_PARTKEY,
p.P_MFGR AS P_MFGR,
s.S_ADDRESS AS S_ADDRESS,
s.S_PHONE AS S_PHONE,
s.S_COMMENT AS S_COMMENT
FROM
PART AS p,
PARTSUPP AS ps,
SUPPLIER AS s,
NATION AS n,
REGION AS r,
(
SELECT
P_PARTKEY,
MIN(PS_SUPPLYCOST) AS PS_SUPPLYCOST
FROM
PARTSUPP,
PART,
SUPPLIER,
NATION,
REGION
WHERE
P_PARTKEY = PS_PARTKEY
AND S_SUPPKEY = PS_SUPPKEY
AND S_NATIONKEY = N_NATIONKEY
AND N_REGIONKEY = R_REGIONKEY
AND R_NAME = 'ASIA'
GROUP BY
P_PARTKEY) pps
WHERE
p.P_PARTKEY = pps.P_PARTKEY
AND ps.PS_SUPPLYCOST = pps.PS_SUPPLYCOST
AND p.P_PARTKEY = ps.PS_PARTKEY
AND s.S_SUPPKEY = ps.PS_SUPPKEY
AND p.P_SIZE = 25
AND p.P_TYPE LIKE '%COPPER'
AND s.S_NATIONKEY = n.N_NATIONKEY
AND n.N_REGIONKEY = r.R_REGIONKEY
AND r.R_NAME = 'ASIA')
ORDER BY
S_ACCTBAL DESC,
N_NAME,
S_NAME,
P_PARTKEY
LIMIT 100;
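The rewrite above replaces a correlated MIN subquery with a join against a grouped derived table. As a small sanity check of that idea, here is a sketch in Python with the built-in sqlite3 module (not ClickHouse, which is why the correlated form runs at all here). The two-table schema and the rows are hypothetical, reduced from the short PART/PARTSUPP example in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part (p_partkey INTEGER PRIMARY KEY, p_name TEXT);
CREATE TABLE partsupp (ps_partkey INTEGER, ps_suppkey INTEGER, ps_supplycost REAL);
-- hypothetical sample data
INSERT INTO part VALUES (1, 'bolt'), (2, 'nut');
INSERT INTO partsupp VALUES (1, 10, 5.0), (1, 11, 3.5), (2, 10, 7.0), (2, 12, 9.0);
""")

# Correlated form: the inner query refers to p_partkey of the outer row.
correlated = """
SELECT p_partkey, ps_suppkey, ps_supplycost
FROM part, partsupp
WHERE p_partkey = ps_partkey
  AND ps_supplycost = (SELECT MIN(ps_supplycost)
                       FROM partsupp
                       WHERE p_partkey = ps_partkey)
"""

# Decorrelated form: compute the minimum per part once, then join on it.
decorrelated = """
SELECT p.p_partkey, ps.ps_suppkey, ps.ps_supplycost
FROM part AS p
JOIN partsupp AS ps ON p.p_partkey = ps.ps_partkey
JOIN (SELECT ps_partkey, MIN(ps_supplycost) AS min_cost
      FROM partsupp
      GROUP BY ps_partkey) AS m
  ON m.ps_partkey = ps.ps_partkey AND m.min_cost = ps.ps_supplycost
"""

correlated_rows = sorted(conn.execute(correlated).fetchall())
decorrelated_rows = sorted(conn.execute(decorrelated).fetchall())
print(correlated_rows)
print(correlated_rows == decorrelated_rows)
```

Both queries pick the cheapest supplier row per part. The join form is the one to port to an engine that rejects the correlated reference.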

Subselect OrderBy first row

I'm doing the below query but on the last subquery (mileage) I'm getting the following error due to the ORDER BY: "ORA-00907: missing right parenthesis", if I remove the ORDER BY it works well.
SELECT /* DATE OF THE ROUTE */
{GPS}.[DateTime],
/* ROUTE DESCRIPTION */
{Route}.[Description],
/* NAME OF THE DRIVER */
{Driver}.[Name],
/* VEHICLE LICENSE PLATE */
{Vehicle}.[Registration],
/* QUANTITY OF STOPS */
(SELECT COUNT({RouteStop}.[RouteId])
FROM {RouteStop}
WHERE {RouteStop}.[RouteId] = {GPS}.[RouteId]) AS StopCount,
/* AMOUNT OF FUEL */
(SELECT SUM(FUEL.[Value])
FROM {GPS} FUEL
WHERE {GPS}.[RouteId] = FUEL.[RouteId]
AND FUEL.[EventTypeId] = 23) FuelAmount, /* Event Fuel */
/* ROUTE STARTDATETIME */
{GPS}.[DateTime] AS ROUTESTARTDATETIME,
/* ROUTE ENDDATETIME */
(SELECT ROUTEENDDATETIME.[DateTime]
FROM {GPS} ROUTEENDDATETIME
WHERE {GPS}.[RouteId] = ROUTEENDDATETIME.[RouteId]
AND ROUTEENDDATETIME.[EventTypeId] = 5 /* Event Route Completed */
AND ROWNUM = 1) AS ROUTEEND,
/* INITIAL MILEAGE */
(SELECT MILEAGEBEGIN.[Value]
FROM {GPS} MILEAGEBEGIN
WHERE {GPS}.[RouteId] = MILEAGEBEGIN.[RouteId]
AND MILEAGEBEGIN.[EventTypeId] = 21 /* Event Mileage */
AND ROWNUM = 1
ORDER BY MILEAGEBEGIN.[DateTime]
) AS INITIALMILEAGE
FROM {GPS}
INNER JOIN {Route}
ON {GPS}.[RouteId] = {Route}.[Id]
INNER JOIN {Driver}
ON {GPS}.[DriverId] = {Driver}.[Id]
INNER JOIN {Availability}
ON {Driver}.[Id] = {Availability}.[DriverId]
INNER JOIN {Vehicle}
ON {Availability}.[VehicleId] = {Vehicle}.[Id]
WHERE {GPS}.[EventTypeId] = 3 /* Event RouteStarted */
I tried it the following way, but I get this error: "ORA-00936: missing expression".
SELECT /* DATE OF THE ROUTE */
{GPS}.[DateTime],
/* INITIAL MILEAGE */
SELECT TEST,'more test' FROM (SELECT MILEAGEBEGIN.[Value] AS TEST
FROM {GPS} MILEAGEBEGIN
WHERE {GPS}.[RouteId] = MILEAGEBEGIN.[RouteId]
AND MILEAGEBEGIN.[EventTypeId] = 21 /* Event Mileage */
ORDER BY MILEAGEBEGIN.[DateTime] ASC
)
WHERE ROWNUM = 1 AS INITIALMILEAGE
FROM {GPS}
WHERE {GPS}.[EventTypeId] = 3 /* Event RouteStarted */
Remove the brackets:
SELECT /* DATE OF THE ROUTE */
GPS.DateTime,
/* ROUTE DESCRIPTION */
Route.Description,
/* NAME OF THE DRIVER */
Driver.Name,
/* VEHICLE LICENSE PLATE */
Vehicle.Registration,
/* QUANTITY OF STOPS */
(SELECT COUNT(RouteStop.RouteId)
FROM RouteStop
WHERE RouteStop.RouteId = GPS.RouteId) AS StopCount,
/* AMOUNT OF FUEL */
(SELECT SUM(FUEL.Value)
FROM GPS FUEL
WHERE GPS.RouteId = FUEL.RouteId
AND FUEL.EventTypeId = 23) FuelAmount, /* Event Fuel */
/* ROUTE STARTDATETIME */
GPS.DateTime AS ROUTESTARTDATETIME,
/* ROUTE ENDDATETIME */
(SELECT ROUTEENDDATETIME.DateTime
FROM GPS ROUTEENDDATETIME
WHERE GPS.RouteId = ROUTEENDDATETIME.RouteId
AND ROUTEENDDATETIME.EventTypeId = 5 /* Event Route Completed */
AND ROWNUM = 1) AS ROUTEEND,
/* INITIAL MILEAGE */
(SELECT MILEAGEBEGIN.Value
FROM GPS MILEAGEBEGIN
WHERE GPS.RouteId = MILEAGEBEGIN.RouteId
AND MILEAGEBEGIN.EventTypeId = 21 /* Event Mileage */
AND ROWNUM = 1
ORDER BY MILEAGEBEGIN.DateTime
) AS INITIALMILEAGE
FROM GPS
INNER JOIN Route
ON GPS.RouteId = Route.Id
INNER JOIN Driver
ON GPS.DriverId = Driver.Id
INNER JOIN Availability
ON Driver.Id = Availability.DriverId
INNER JOIN Vehicle
ON Availability.VehicleId = Vehicle.Id
WHERE GPS.EventTypeId = 3 /* Event RouteStarted */
The Other query:
SELECT /* DATE OF THE ROUTE */
GPS.DateTime,
/* INITIAL MILEAGE */
( SELECT TEST FROM (SELECT MILEAGEBEGIN.Value AS TEST
FROM GPS MILEAGEBEGIN
WHERE GPS.RouteId = MILEAGEBEGIN.RouteId
AND MILEAGEBEGIN.EventTypeId = 21 /* Event Mileage */
ORDER BY MILEAGEBEGIN.DateTime ASC
)
WHERE ROWNUM = 1) AS INITIALMILEAGE
FROM GPS
WHERE GPS.EventTypeId = 3 /* Event RouteStarted */
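The nested form works because Oracle assigns ROWNUM to rows as they pass the WHERE clause, before any ORDER BY is applied. Filtering on ROWNUM = 1 in the same query block as the ORDER BY therefore keeps an arbitrary row, not the earliest one. The following plain-Python analogy (no database involved, the mileage readings are made up) only illustrates the order of the two steps:

```python
from datetime import datetime

# Hypothetical mileage events in arbitrary storage order: (recorded_at, value)
events = [
    (datetime(2015, 3, 1, 17, 30), 12480),
    (datetime(2015, 3, 1, 8, 0), 12300),
    (datetime(2015, 3, 1, 12, 15), 12390),
]

def rownum_then_order(rows):
    """Analogy for WHERE ROWNUM = 1 ... ORDER BY in one query block:
    one row is kept first (whatever happens to come first), then sorted."""
    limited = rows[:1]
    return sorted(limited)

def order_then_rownum(rows):
    """Analogy for SELECT ... FROM (SELECT ... ORDER BY ...) WHERE ROWNUM = 1:
    all rows are sorted first, then the first one is kept."""
    return sorted(rows)[:1]

print(rownum_then_order(events))   # the row that happened to be stored first
print(order_then_rownum(events))   # the earliest reading
```

Only the second function returns the initial mileage, which corresponds to the query with the ORDER BY inside the inner SELECT.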
Correct query
SELECT /* ROUTEID */
ROUTES.[RouteId] AS ROUTEID,
/* ROUTE STARTDATETIME */
ROUTES.[DateTime] AS ROUTESTARTDATETIME,
/* ROUTE DESCRIPTION */
{Route}.[Description],
/* NAME OF THE DRIVER */
{Driver}.[Name],
/* VEHICLE LICENSE PLATE */
{Vehicle}.[Registration],
/* QUANTITY OF STOPS */
(SELECT COUNT({RouteStop}.[RouteId])
FROM {RouteStop}
WHERE {RouteStop}.[RouteId] = ROUTEID) AS STOPCOUNT,
/* AMOUNT OF FUEL */
(SELECT SUM(FUEL.[Value])
FROM {GPS} FUEL
WHERE ROUTES.[AvailabilityId] = FUEL.[AvailabilityId]
AND TRUNC(ROUTES.[DateTime]) = TRUNC(FUEL.[DateTime])
AND FUEL.[EventTypeId] = #FuelEventTypeId) AS FUELAMOUNT,
/* ROUTE ENDDATETIME */
(SELECT ROUTEENDDATETIME.[DateTime]
FROM {GPS} ROUTEENDDATETIME
WHERE ROUTEID = ROUTEENDDATETIME.[RouteId]
AND ROUTEENDDATETIME.[EventTypeId] = #RouteCompletedEventTypeId
AND ROWNUM = 1) AS ROUTEEND,
/* INITIAL MILEAGE */
(SELECT INITIALMILEAGE
FROM (SELECT MILEAGEBEGIN.[Value] AS INITIALMILEAGE
FROM {GPS} MILEAGEBEGIN
WHERE ROUTES.[AvailabilityId] = MILEAGEBEGIN.[AvailabilityId]
AND TRUNC(ROUTES.[DateTime]) = TRUNC(MILEAGEBEGIN.[DateTime])
AND MILEAGEBEGIN.[EventTypeId] = #MileageEventTypeId
ORDER BY MILEAGEBEGIN.[DateTime] ASC
)
WHERE ROWNUM = 1),
/* FINAL MILEAGE */
(SELECT FINALMILEAGE
FROM (SELECT MILEAGEEND.[Value] AS FINALMILEAGE
FROM {GPS} MILEAGEEND
WHERE ROUTES.[AvailabilityId] = MILEAGEEND.[AvailabilityId]
AND TRUNC(ROUTES.[DateTime]) = TRUNC(MILEAGEEND.[DateTime])
AND MILEAGEEND.[EventTypeId] = #MileageEventTypeId
ORDER BY MILEAGEEND.[DateTime] DESC
)
WHERE ROWNUM = 1)
FROM {GPS} ROUTES
INNER JOIN {Route}
ON ROUTES.[RouteId] = {Route}.[Id]
INNER JOIN {Availability}
ON ROUTES.[AvailabilityID] = {Availability}.[Id]
INNER JOIN {Driver}
ON {Availability}.[DriverId] = {Driver}.[Id]
INNER JOIN {Vehicle}
ON {Availability}.[VehicleId] = {Vehicle}.[Id]
WHERE ROUTES.[EventTypeId] = #RouteStartedEventTypeId

Postgresql Stored Function sometimes executes very slowly

We have a pretty big plpgsql function with an IF/ELSIF statement in PostgreSQL 9.4.4.
Inside every IF body there are calls to stable-sql functions.
We call the function in the following way:
SELECT *
from rawdata.getNumbersForUserBasedMetricEventsGroupedByClient('2015-09-28','2015-10-28','{4}'::int[],2,null,null,null,null,null);
The first 4-5 times the function executes quite fast, in about 2.5 seconds, but then the performance suddenly drops and the execution takes about 7.5 seconds. It stays at that level for all subsequent calls.
We also tried to declare the plpgsql function as stable, but that did not help.
When we call one of the inner stable-sql functions directly, the executions always take about 2.5 seconds.
This is the Schema of the rawdata.metricevent table:
rawdata.metricevent (metriceventid bigint PRIMARY KEY,
metricevent integer,
client integer,
age integer,
country varchar(256),
userideventowner bigint,
contributoruserid bigint,
tournamentid bigint,
eventoccurtime timestamp,
iscounted boolean)
We have a btree index on the eventoccurtime column. Without the btree index the difference is even bigger: the execution sometimes finishes in just a few seconds, but sometimes it takes more than 100 seconds.
Now our questions are: Why is that? What happens when the plpgsql function is executed the 5th or 6th time, and why does it suddenly take so long? By the way, the CPU load is also very high for these queries.
We also analyzed the query with EXPLAIN ANALYZE: the query planner ALWAYS takes about 0.034 ms, but the query execution varies between 2.5 seconds and 7.5 seconds. It is never anywhere in between; it's either 2.5 seconds or 7.5 seconds.
Below are the main plpgsql function, which has the variable execution times, and the stable-sql functions, which have constant execution times.
CREATE OR REPLACE FUNCTION rawdata.getNumbersForUserBasedMetricEventsGroupedByClient(pFrom timestamp, pTo timestamp, pMetricEvent integer[], pTimeDomainType integer,
pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
BEGIN
IF pTimeDomainType = 1 THEN
--hours
RETURN QUERY
SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerHours(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);
ELSIF pTimeDomainType = 2 THEN
--days
RETURN QUERY
SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerDays(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);
ELSIF pTimeDomainType = 3 THEN
--week
RETURN QUERY
SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerWeeks(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);
ELSIF pTimeDomainType = 4 THEN
--month
RETURN QUERY
SELECT * FROM rawdata.getNumbersForUBMetricEventsGroupedByClientPerMonths(pFrom,pTo,pMetricEvent,pCountry,pAgeFrom,pAgeTo,pUserLanguage,pTournamentLanguage);
END IF;
END;
$$
LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerHours(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
SELECT hours timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
FROM generate_series
( pFrom::timestamp
, pTo::timestamp + '23 hour'
, '1 hour'::interval) hours
LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom
AND e1.eventoccurtime < pTo + '1 day'
AND (e1.metricevent = ANY (pMetricEvent))
AND (e1.country = pCountry OR pCountry is null)
AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
AND userideventowner >= 110
AND hours = date_trunc('hour',e1.eventoccurtime)
LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
LEFT JOIN rawdata.metricevent em ON e1.metriceventid = em.metriceventid AND em.client=1
LEFT JOIN rawdata.metricevent ef ON e1.metriceventid = ef.metriceventid AND ef.client=2
LEFT JOIN rawdata.metricevent emh ON e1.metriceventid = emh.metriceventid AND emh.client=3
LEFT JOIN rawdata.metricevent efh ON e1.metriceventid = efh.metriceventid AND efh.client=4
GROUP BY hours
ORDER BY hours;
$$
LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerDays(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
SELECT days timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
FROM generate_series
( pFrom::timestamp
, pTo::timestamp
, '1 day'::interval) days
LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom
AND e1.eventoccurtime < pTo + '1 day'
AND (e1.metricevent = ANY (pMetricEvent))
AND (e1.country = pCountry OR pCountry is null)
AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
AND userideventowner >= 110
AND days = date_trunc('day',e1.eventoccurtime)
LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
LEFT JOIN rawdata.metricevent em ON e1.metriceventid = em.metriceventid AND em.client=1
LEFT JOIN rawdata.metricevent ef ON e1.metriceventid = ef.metriceventid AND ef.client=2
LEFT JOIN rawdata.metricevent emh ON e1.metriceventid = emh.metriceventid AND emh.client=3
LEFT JOIN rawdata.metricevent efh ON e1.metriceventid = efh.metriceventid AND efh.client=4
GROUP BY days
ORDER BY days;
$$
LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerWeeks(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
SELECT min(days) timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
FROM generate_series
( pFrom::timestamp
, pTo::timestamp
, '1 day'::interval) days
LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom
AND e1.eventoccurtime < pTo + '1 day'
AND (e1.metricevent = ANY (pMetricEvent))
AND (e1.country = pCountry OR pCountry is null)
AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
AND userideventowner >= 110
AND days = date_trunc('day',e1.eventoccurtime)
LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
LEFT JOIN rawdata.metricevent em ON e1.metriceventid = em.metriceventid AND em.client=1
LEFT JOIN rawdata.metricevent ef ON e1.metriceventid = ef.metriceventid AND ef.client=2
LEFT JOIN rawdata.metricevent emh ON e1.metriceventid = emh.metriceventid AND emh.client=3
LEFT JOIN rawdata.metricevent efh ON e1.metriceventid = efh.metriceventid AND efh.client=4
GROUP BY EXTRACT(WEEK FROM days)
ORDER BY 1;
$$
LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION rawdata.getNumbersForUBMetricEventsGroupedByClientPerMonths(pFrom timestamp, pTo timestamp, pMetricEvent integer[],
pCountry varchar(100),pAgeFrom integer,pAgeTo integer,pUserlanguage varchar(50),pTournamentlanguage varchar(50))
RETURNS TABLE(dfrom timestamp, x bigint, y bigint, xx bigint, yy bigint)
AS $$
SELECT min(days) timedomain,count(distinct em.userideventowner) as x,count(distinct ef.userideventowner) as y,count(distinct emh.userideventowner) as xx,count(distinct efh.userideventowner) as yy
FROM generate_series
( pFrom::timestamp
, pTo::timestamp
, '1 day'::interval) days
LEFT JOIN rawdata.metricevent e1 ON e1.eventoccurtime >=pFrom
AND e1.eventoccurtime < pTo + '1 day'
AND (e1.metricevent = ANY (pMetricEvent))
AND (e1.country = pCountry OR pCountry is null)
AND (e1.age >= pAgeFrom OR pAgeFrom is null) AND (e1.age <= pAgeTo OR pAgeTo is null)
AND userideventowner >= 110
AND days = date_trunc('day',e1.eventoccurtime)
LEFT JOIN rawdata.userlanguage ul ON e1.userideventowner = ul.userideventowner
AND (ul.userlanguage = pUserLanguage OR pUserLanguage is null)
LEFT JOIN rawdata.metricevent em ON e1.metriceventid = em.metriceventid AND em.client=1
LEFT JOIN rawdata.metricevent ef ON e1.metriceventid = ef.metriceventid AND ef.client=2
LEFT JOIN rawdata.metricevent emh ON e1.metriceventid = emh.metriceventid AND emh.client=3
LEFT JOIN rawdata.metricevent efh ON e1.metriceventid = efh.metriceventid AND efh.client=4
GROUP BY EXTRACT(MONTH FROM days)
ORDER BY 1;
$$
LANGUAGE sql STABLE;
Kind regards, Thomas

Laravel, why do I have so many queries?

Controller
$attendees = Attendee::with('User')->get();
return View::make('admin.attendees.index', compact('attendees'));
Attendee model
public function user()
{
    return $this->belongsTo('User');
}
// Second pane of the same paste (apparently a separate role check, not part of the model):
if( !( $user->hasRole('admin') || $user->hasRole('programmer') ))
    return Redirect::to('/');
View
@foreach($attendees as $attendee)
<td>{{link_to_route('admin.users.show', $attendee->user->username, $attendee->user->id)}}</td>
@endforeach
223 queries
select * from `users` where `users`.`id` = '4' limit 1  (600μs)
select `roles`.*, `assigned_roles`.`user_id` as `pivot_user_id`, `assigned_roles`.`role_id` as `pivot_role_id` from `roles` inner join `assigned_roles` on `roles`.`id` = `assigned_roles`.`role_id` where `assigned_roles`.`user_id` = '4'  (630μs)
select * from `attendees`  (1.24ms)
select * from `users` where `users`.`id` in ('5', '1', '3', '8', '9', '10')  (780μs)
select * from `users` where `users`.`id` = '5' limit 1  (680μs)
select * from `users` where `users`.`id` = '5' limit 1  (650μs)
select * from `users` where `users`.`id` = '5' limit 1  (680μs)
select * from `users` where `users`.`id` = '5' limit 1  (590μs)
select * from `users` where `users`.`id` = '1' limit 1
<continues like so for each user id>
I am using phpdebugbar to show the queries.
Migration
Schema::table('attendees', function(Blueprint $table) {
    $table->foreign('user_id')->references('id')->on('users')
        ->onDelete('cascade')
        ->onUpdate('no action');
});
Am I doing something wrong that is causing the query to be run over and over again?
The eager load should be the name of the relationship function, not the relationship's model, and it's apparently case sensitive:
$attendees = Attendee::with('user')->get();
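The query log in the question shows the pattern: one query for the attendees, then one users query per access in the view, because the view falls back to lazy loading. With a working eager load, the users come from a single IN query. A rough sketch of the difference in Python with sqlite3 (not Laravel; the tables, rows and helper names below are hypothetical), counting the statements each approach issues:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE attendees (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
-- hypothetical sample data
INSERT INTO users VALUES (1, 'ann'), (3, 'bob'), (5, 'cy');
INSERT INTO attendees (user_id) VALUES (5), (1), (3), (5);
""")

log = []  # every statement sent through run() is recorded here

def run(sql, params=()):
    log.append(sql)
    return conn.execute(sql, params).fetchall()

def lazy_usernames():
    """One query for the attendees plus one users query per attendee (N+1)."""
    log.clear()
    names = []
    for (user_id,) in run("SELECT user_id FROM attendees ORDER BY id"):
        names.append(run("SELECT username FROM users WHERE id = ?", (user_id,))[0][0])
    return names, len(log)

def eager_usernames():
    """One query for the attendees plus one IN query for all their users."""
    log.clear()
    user_ids = [uid for (uid,) in run("SELECT user_id FROM attendees ORDER BY id")]
    unique_ids = sorted(set(user_ids))
    placeholders = ", ".join("?" for _ in unique_ids)
    by_id = dict(run(f"SELECT id, username FROM users WHERE id IN ({placeholders})", unique_ids))
    return [by_id[uid] for uid in user_ids], len(log)

print(lazy_usernames())
print(eager_usernames())
```

Both functions return the same usernames. The lazy version needs one extra query per attendee, the eager version always needs two.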

Laravel Eloquent - Where In All

In Laravel 4.2, I am trying to achieve a query that returns all users, that have all of certain activities. As of now, I have a query that returns all users that have one of many activities:
//$selectedActivities being an array
$userByActivities = User::with('activities')
->whereHas('activities', function($query) use($selectedActivities){
$query->whereIn('id', $selectedActivities);
})->get();
To be more clear: given activities a,b,c. I am looking for all users that have activity a AND b AND c. My query returns all users that have activity a OR b OR c.
Thank you for your help.
EDIT:
The solution offered by lukasgeiter results in following query:
select * from `users` where
(select count(*) from `activities` inner join `activity_user` on `activities`.`id` = `activity_user`.`activity_id` where `activity_user`.`user_id` = `users`.`id` and `id` = '7') >= 1
and (select count(*) from `activities` inner join `activity_user` on `activities`.`id` = `activity_user`.`activity_id` where `activity_user`.`user_id` = `users`.`id` and `id` = '3') >= 1
and (select count(*) from `activities` inner join `activity_user` on `activities`.`id` = `activity_user`.`activity_id` where `activity_user`.`user_id` = `users`.`id` and `id` = '1') >= 1
and (select count(*) from `activities` inner join `activity_user` on `activities`.`id` = `activity_user`.`activity_id` where `activity_user`.`user_id` = `users`.`id` and `id` = '2') >= 1
Whereas the solution offered by Jarek Tkaczyk:
$userByActivities = User::with('activities')
->whereHas('activities', function($query) use($selectedActivities) {
$query->selectRaw('count(distinct id)')->whereIn('id', $selectedActivities);
}, '=', count($selectedActivities))->get();
for a similar request, results in following query:
select * from `users` where (select count(distinct id) from `activities`
inner join `activity_user` on `activities`.`id` = `activity_user`.`activity_id`
where `activity_user`.`user_id` = `users`.`id` and `id` in ('7', '3', '1', '2')) = 4
You'll have to add multiple whereHas for that:
$query = User::with('activities');
foreach($selectedActivities as $activityId){
$query->whereHas('activities', function($q) use ($activityId){
$q->where('id', $activityId);
});
}
$userByActivities = $query->get();
If you are getting "Cardinality violation: 1241 Operand should contain 2 column(s)", the problem is that the nested count select is added to the normal select count(*) instead of overriding it. Changing the closure to $query->distinct()->whereIn('id', $selectedActivities); did the trick for me, as did changing it to $query->select(DB::raw('count(distinct id)'))
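The count-based query above is a form of relational division: a user qualifies only if the number of distinct matching activities equals the number of selected activities. Here is a minimal sketch of that SQL idea in Python with sqlite3 (not Eloquent). The pivot table name follows the generated SQL in the question, while the rows and the users_with_all helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE activity_user (user_id INTEGER, activity_id INTEGER);
-- hypothetical sample data; user 3 has activity 1 twice
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO activity_user VALUES (1, 7), (1, 3), (1, 1), (2, 7), (2, 3), (3, 1), (3, 1);
""")

def users_with_all(selected):
    """Return names of users that have every activity id in `selected`."""
    wanted = sorted(set(selected))  # duplicates in the input must not inflate the target count
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT u.name
        FROM users u
        WHERE (SELECT COUNT(DISTINCT au.activity_id)
               FROM activity_user au
               WHERE au.user_id = u.id
                 AND au.activity_id IN ({placeholders})) = ?
        ORDER BY u.name
    """
    return [name for (name,) in conn.execute(sql, [*wanted, len(wanted)])]

print(users_with_all([7, 3, 1]))
print(users_with_all([7, 3]))
```

COUNT(DISTINCT ...) matters here: without DISTINCT, a user with the same activity attached twice could reach the target count without having all activities.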
