Slow aggregation on big neo4j graph

Configuration:
Windows 8.1
neo4j-enterprise-2.2.0-M03
cache type: hpc
8Gb RAM
6Gb for JVM Heap (wrapper.java.initmemory=6144 wrapper.java.maxmemory=6144)
5Gb out of 6Gb of JVM Heap for mapped memory (dbms.pagecache.memory=5G)
Model:
The model represents how users navigate through the website.
27 522 896 nodes (394Mb)
111 294 796 relationships (3609Mb)
33 906 363 properties (1326Mb)
293 (:Page) nodes
27522603 (:PageView) nodes
0 (:User) nodes (not loaded yet)
each (:PageView) node is connected to a (:Page) node
each (:PageView) node is connected to the next (:PageView) node
each (:PageView) node will be connected to a (:User) node (not yet)
Query
match (:Page {Name:'#########.aspx'})<-[:At]-(:PageView)-[:Next]->(:PageView)-[:At]->(p:Page)
return p.Name,count(*) as count
order by count desc
limit 10;
Profile info:
+------------------------------------------------+
| p.Name | count |
+------------------------------------------------+
| "#####################.aspx" | 5172680 |
| "###############.aspx" | 3846455 |
| "#########.aspx" | 3579022 |
| "###########.aspx" | 3051043 |
| "#############################.aspx" | 1713004 |
| "############.aspx" | 1373928 |
| "############.aspx" | 1338063 |
| "#####.aspx" | 1285447 |
| "###################.aspx" | 884077 |
| "##############.aspx" | 759665 |
+------------------------------------------------+
10 rows
195363 ms
Compiler CYPHER 2.2
Planner COST
Projection(0)
|
+Top
|
+EagerAggregation
|
+Projection(1)
|
+Filter(0)
|
+Expand(All)(0)
|
+Filter(1)
|
+Expand(All)(1)
|
+Filter(2)
|
+Expand(All)(2)
|
+NodeUniqueIndexSeek
+---------------------+---------------+----------+----------+-------------------------------------------+--------------------------------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+----------+----------+-------------------------------------------+--------------------------------------------------+
| Projection(0) | 881 | 10 | 0 | FRESHID105, FRESHID110, count, p.Name | p.Name, count |
| Top | 881 | 10 | 0 | FRESHID105, FRESHID110 | { AUTOINT1}; |
| EagerAggregation | 881 | 173 | 0 | FRESHID105, FRESHID110 | |
| Projection(1) | 776404 | 35941815 | 71883630 | FRESHID105, p | |
| Filter(0) | 776404 | 35941815 | 35941815 | p | (NOT(anon[38] == anon[78]) AND hasLabel(p:Page)) |
| Expand(All)(0) | 776404 | 35941815 | 49287436 | p | ()-[:At]->(p) |
| Filter(1) | 384001 | 13345621 | 13345621 | | hasLabel(anon[67]:PageView) |
| Expand(All)(1) | 384001 | 13345621 | 19478500 | | ()-[:Next]->() |
| Filter(2) | 189923 | 6132879 | 6132879 | | hasLabel(anon[46]:PageView) |
| Expand(All)(2) | 189923 | 6132879 | 6132880 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+----------+----------+-------------------------------------------+--------------------------------------------------+
Total database accesses: 202202762
Query without unnecessary labels
match (:Page {Name:'Dashboard.aspx'})<-[:At]-()-[:Next]->()-[:At]->(p)
return p.Name,count(*) as count
order by count desc
limit 10;
Profile info:
+------------------------------------------------+
| p.Name | count |
+------------------------------------------------+
| "#####################.aspx" | 5172680 |
| "###############.aspx" | 3846455 |
| "#########.aspx" | 3579022 |
| "###########.aspx" | 3051043 |
| "#############################.aspx" | 1713004 |
| "############.aspx" | 1373928 |
| "############.aspx" | 1338063 |
| "#####.aspx" | 1285447 |
| "###################.aspx" | 884077 |
| "##############.aspx" | 759665 |
+------------------------------------------------+
10 rows
166751 ms
Compiler CYPHER 2.2
Planner COST
Projection(0)
|
+Top
|
+EagerAggregation
|
+Projection(1)
|
+Filter
|
+Expand(All)(0)
|
+Expand(All)(1)
|
+Expand(All)(2)
|
+NodeUniqueIndexSeek
+---------------------+---------------+----------+----------+-----------------------------------------+---------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+----------+----------+-----------------------------------------+---------------------------+
| Projection(0) | 881 | 10 | 0 | FRESHID82, FRESHID87, count, p.Name | p.Name, count |
| Top | 881 | 10 | 0 | FRESHID82, FRESHID87 | { AUTOINT1}; |
| EagerAggregation | 881 | 173 | 0 | FRESHID82, FRESHID87 | |
| Projection(1) | 776388 | 35941815 | 71883630 | FRESHID82, p | |
| Filter | 776388 | 35941815 | 0 | p | NOT(anon[38] == anon[60]) |
| Expand(All)(0) | 776388 | 35941815 | 49287436 | p | ()-[:At]->(p) |
| Expand(All)(1) | 383997 | 13345621 | 19478500 | | ()-[:Next]->() |
| Expand(All)(2) | 189923 | 6132879 | 6132880 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+----------+----------+-----------------------------------------+---------------------------+
Total database accesses: 146782447
Message.log
Question
How can I make this query run much faster? (more RAM, refactoring the query, a distributed cache, another language/shell/method, ...)
UPD:
Profile info for the last query in the answer
neo4j-sh (?)$ profile match (:Page {Name:'Dashboard.aspx'})<-[:At]-()-[:Next]->()-[:At]->(p)
with p,count(*) as count
order by count desc
limit 10 return p.Name, count;
+------------------------------------------------+
| p.Name | count |
+------------------------------------------------+
| "OutgoingDocumentsList.aspx" | 5172680 |
| "DocumentPreview.aspx" | 3846455 |
| "Dashboard.aspx" | 3579022 |
| "ActualTasks.aspx" | 3051043 |
| "DocumentFillMissingRequisites.aspx" | 1713004 |
| "EditDocument.aspx" | 1373928 |
| "PaymentsList.aspx" | 1338063 |
| "Login.aspx" | 1285447 |
| "ReportingRequisites.aspx" | 884077 |
| "ContractorInfo.aspx" | 759665 |
+------------------------------------------------+
10 rows
151328 ms
Compiler CYPHER 2.2
Planner COST
Projection
|
+Top
|
+EagerAggregation
|
+Filter
|
+Expand(All)(0)
|
+Expand(All)(1)
|
+Expand(All)(2)
|
+NodeUniqueIndexSeek
+---------------------+---------------+----------+----------+------------------+---------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+----------+----------+------------------+---------------------------+
| Projection | 881 | 10 | 20 | count, p, p.Name | p.Name, count |
| Top | 881 | 10 | 0 | count, p | { AUTOINT1}; count |
| EagerAggregation | 881 | 173 | 0 | count, p | p |
| Filter | 776388 | 35941815 | 0 | p | NOT(anon[38] == anon[60]) |
| Expand(All)(0) | 776388 | 35941815 | 49287436 | p | ()-[:At]->(p) |
| Expand(All)(1) | 383997 | 13345621 | 19478500 | | ()-[:Next]->() |
| Expand(All)(2) | 189923 | 6132879 | 6132880 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+----------+----------+------------------+---------------------------+
Total database accesses: 74898837

As I mentioned in your other question, if you can write a Java-based server extension, you can do this pretty easily.
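import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
import org.neo4j.graphdb.*;
import static org.neo4j.graphdb.Direction.*;

// Snippet for the body of a server-extension method (Neo4j 2.2 API names);
// graphDb and pageName are supplied by the caller, and it must run in a transaction.
// labels and relationship types used by the model
Label Page = DynamicLabel.label("Page");
RelationshipType At = DynamicRelationshipType.withName("At");
RelationshipType Next = DynamicRelationshipType.withName("Next");
// initialize counters for all pages
Map<Node,AtomicInteger> pageCounts = new HashMap<>(300);
ResourceIterator<Node> allPages = graphDb.findNodes(Page);
while (allPages.hasNext()) pageCounts.put(allPages.next(), new AtomicInteger());
// find start page
Node page = graphDb.findNode(Page, "Name", pageName);
// follow page-view relationships
for (Relationship at : page.getRelationships(At, INCOMING)) {
    // follow singular next relationship
    Relationship next = at.getStartNode().getSingleRelationship(Next, OUTGOING);
    if (next == null) continue;
    // follow singular page-view relationship from the next page view to its end-page
    Node page2 = next.getEndNode().getSingleRelationship(At, OUTGOING).getEndNode();
    // increment counter
    pageCounts.get(page2).incrementAndGet();
}
// sort pages by count descending
List<Map.Entry<Node,AtomicInteger>> pages = new ArrayList<>(pageCounts.entrySet());
Collections.sort(pages, new Comparator<Map.Entry<Node,AtomicInteger>>() {
    public int compare(Map.Entry<Node,AtomicInteger> e1, Map.Entry<Node,AtomicInteger> e2) {
        return Integer.compare(e2.getValue().get(), e1.getValue().get());
    }
});
// return top 10 (or fewer if there are fewer pages)
return pages.subList(0, Math.min(10, pages.size()));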
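To try the counting logic without a database, here is a minimal in-memory sketch in plain Java. It is not Neo4j code: the `NextPageCounter` class, the `topNextPages` method and the list-of-chains representation are illustrative assumptions. Each inner list stands for one chain of page views linked by Next, stored as the name of the page each view is At.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the graph: each inner list is one chain of
// page views (linked by Next), stored as the name of the page each view is At.
final class NextPageCounter {

    // Counts, for every view of startPage, which page the following view is at,
    // and returns the 'limit' most frequent follow-up pages, highest count first.
    static List<Map.Entry<String, Integer>> topNextPages(
            List<List<String>> chains, String startPage, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> chain : chains) {
            for (int i = 0; i + 1 < chain.size(); i++) {
                if (chain.get(i).equals(startPage)) {
                    counts.merge(chain.get(i + 1), 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        // Sort by count descending; break ties by name so the order is deterministic.
        sorted.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }
}
```

As in the extension code, the expensive part is a single pass over the views of the start page; the sort only touches one entry per distinct page, which is at most a few hundred in this model.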
For Cypher I would try something like this:
match (:Page {Name:'#########.aspx'})<-[:At]-(pv:PageView)
WITH distinct pv
MATCH (pv)-[:Next]->(pv2:PageView)
with distinct pv2
match (pv2)-[:At]->(p:Page)
return p.Name,count(*) as count
order by count desc
limit 10;
Update
I wrote a test for it and ran it on my bigger Linux machine. The results there are much more sensible: between 1.6 s in Java and 5 s max in Cypher.
Here is the code and the results: https://gist.github.com/jexp/94f75ddb849f8c41c97c
In Cypher:
-------------------
match (:Page {Name:'Page1'})<-[:At]-()-[:Next]->()-[:At]->(p)
return p.Name,count(*) as count
order by count desc
limit 10;
+-------------------+
| p.Name | count |
+-------------------+
| "Page169" | 975 |
| "Page125" | 959 |
| "Page106" | 955 |
| "Page274" | 951 |
| "Page176" | 947 |
| "Page241" | 944 |
| "Page30" | 942 |
| "Page44" | 938 |
| "Page1" | 938 |
| "Page118" | 938 |
+-------------------+
10 rows
in 3212 ms
[Compiler CYPHER 2.2
Planner COST
+---------------------+---------------+--------+--------+--------------------------+---------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+---------------------+---------------+--------+--------+--------------------------+---------------------------+
| Top | 488 | 10 | 0 | FRESHID71, FRESHID76 | { AUTOINT1}; |
| EagerAggregation | 488 | 300 | 0 | FRESHID71, FRESHID76 | |
| Projection | 238460 | 264828 | 529656 | FRESHID71, p | |
| Filter | 238460 | 264828 | 0 | p | NOT(anon[29] == anon[51]) |
| Expand(All)(0) | 238460 | 264828 | 529656 | p | ()-[:At]->(p) |
| Expand(All)(1) | 238460 | 264828 | 778522 | | ()-[:Next]->() |
| Expand(All)(2) | 476922 | 513694 | 513695 | | ()<-[:At]-() |
| NodeUniqueIndexSeek | 1 | 1 | 1 | | :Page(Name) |
+---------------------+---------------+--------+--------+--------------------------+---------------------------+
Total database accesses: 2351530]
And in Java:
-------------------
Java took 1618 ms
Node[169]=975
Node[125]=959
Node[106]=955
Node[274]=951
Node[176]=947
Node[241]=944
Node[30]=942
Node[1]=938
Node[44]=938
Node[118]=938
Something else you can do to speed up the Cypher query is to aggregate on the nodes only and return the p.Name property just for the final 10 rows, which is much faster. This way the Name property is read 10 times instead of once per matched path.
match (:Page {Name:'Page1'})<-[:At]-()-[:Next]->()-[:At]->(p)
with p,count(*) as count
order by count desc
limit 10 return p.Name, count

Related

Pivot Table in Hive and Create Multiple Columns for Unique Combinations

I want to pivot the following table
| ID | Code | date | qty |
| 1 | A | 1/1/19 | 11 |
| 1 | A | 2/1/19 | 12 |
| 2 | B | 1/1/19 | 13 |
| 2 | B | 2/1/19 | 14 |
| 3 | C | 1/1/19 | 15 |
| 3 | C | 3/1/19 | 16 |
into
| ID | Code | mth_1(1/1/19) | mth_2(2/1/19) | mth_3(3/1/19) |
| 1 | A | 11 | 12 | 0 |
| 2 | B | 13 | 14 | 0 |
| 3 | C | 15 | 0 | 16 |
I am new to Hive and am not sure how to implement this.
NOTE: I don't want to do mapping because my month values change over time.
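The target shape can be pinned down with a small sketch, written here in plain Java rather than HiveQL. It discovers the distinct dates at run time instead of hard-coding them, sorts them, and fills missing combinations with 0. The `MonthPivot` class and its `Row` type are hypothetical, and reading the dates as month/day/year is an assumption based on the `mth_` column names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class MonthPivot {

    // One input row: ID, Code, date and qty, as in the source table.
    static final class Row {
        final int id; final String code; final String date; final int qty;
        Row(int id, String code, String date, int qty) {
            this.id = id; this.code = code; this.date = date; this.qty = qty;
        }
    }

    // Assumption: dates are month/day/two-digit-year, e.g. "2/1/19".
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("M/d/yy");

    // Returns one entry per (ID, Code) pair, keyed "ID|Code", whose value holds
    // the summed qty for each distinct date in ascending date order (0 if absent).
    static Map<String, int[]> pivot(List<Row> rows) {
        // Pass 1: discover the distinct dates; they become the mth_1..mth_n columns.
        TreeSet<LocalDate> dates = new TreeSet<>();
        for (Row r : rows) dates.add(LocalDate.parse(r.date, FORMAT));
        List<LocalDate> columns = new ArrayList<>(dates);

        // Pass 2: add each qty into the column that matches its date.
        Map<String, int[]> result = new LinkedHashMap<>();
        for (Row r : rows) {
            int[] cells = result.computeIfAbsent(r.id + "|" + r.code, k -> new int[columns.size()]);
            cells[columns.indexOf(LocalDate.parse(r.date, FORMAT))] += r.qty;
        }
        return result;
    }
}
```

The two passes are the point: because the set of columns depends on the data, it has to be computed before any output row can be laid out, which is why a fixed mapping does not fit here.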

MariaDB: two explain plans, with and without "Using join buffer"

I run the same query in two environments with a huge performance difference: 0.015 sec vs 25 sec.
Explain plan:
+------+-------------+---------------+--------+------------------------------------+---------+---------+---------------------------------------------------------------------------------------------------------------------------+------+----------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+---------------+--------+------------------------------------+---------+---------+---------------------------------------------------------------------------------------------------------------------------+------+----------+---------------------------------+
| 1 | SIMPLE | company1_ | const | PRIMARY | PRIMARY | 152 | const | 1 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | user2_ | ref | PRIMARY | PRIMARY | 152 | const | 1032 | 100.00 | Using where |
| 1 | SIMPLE | vacationpr5_ | eq_ref | PRIMARY | PRIMARY | 304 | user2_.ID_COMPANY_VACATION_PROFILE,.user2_.ID_VACATION_PROFILE | 1 | 100.00 | Using index |
| 1 | SIMPLE | vacationac0_ | ref | PRIMARY,I_VACATION_ACCUMULATION_EA | PRIMARY | 304 | const,.user2_.ID_USER | 4 | 100.00 | Using where |
| 1 | SIMPLE | vacationty3_ | eq_ref | PRIMARY | PRIMARY | 304 | const,.vacationac0_.ID_VACATION_TYPE | 1 | 100.00 | Using where |
| 1 | SIMPLE | vacationst6_ | eq_ref | PRIMARY | PRIMARY | 608 | user2_.ID_COMPANY_VACATION_PROFILE,.user2_.ID_VACATION_PROFILE,const,.vacationac0_.ID_VACATION_TYPE | 1 | 100.00 | Using where |
| 1 | SIMPLE | translatio9_ | eq_ref | PRIMARY | PRIMARY | 919 | vacationty3_.ID_COMPANY_TRANSLATION,.vacationty3_.ID_TRANSLATION | 1 | 100.00 | Using index |
| 1 | SIMPLE | descriptio10_ | eq_ref | PRIMARY, | PRIMARY | 951 | vacationty3_.ID_COMPANY_TRANSLATION,.vacationty3_.ID_TRANSLATION,const | 1 | 100.00 | Using where |
| 1 | SIMPLE | listvalue4_ | ALL | NULL | NULL | NULL | NULL | 5284 | 100.00 | Using where |
| 1 | SIMPLE | translatio7_ | eq_ref | PRIMARY | PRIMARY | 919 | listvalue4_.ID_COMPANY_TRANSLATION,.listvalue4_.ID_TRANSLATION | 1 | 100.00 | Using index |
| 1 | SIMPLE | descriptio8_ | eq_ref | PRIMARY | PRIMARY | 951 | listvalue4_.ID_COMPANY_TRANSLATION,.listvalue4_.ID_TRANSLATION,const | 1 | 100.00 | Using where |
+------+-------------+---------------+--------+------------------------------------+---------+---------+---------------------------------------------------------------------------------------------------------------------------+------+----------+---------------------------------+
Second explain plan:
+------+-------------+---------------+--------+------------------------------------+---------+---------+---------------------------------------------------------------------------------------------------------------------------------------+------+----------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+---------------+--------+------------------------------------+---------+---------+---------------------------------------------------------------------------------------------------------------------------------------+------+----------+-------------------------------------------------+
| 1 | SIMPLE | company1_ | const | PRIMARY | PRIMARY | 152 | const | 1 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | user2_ | ref | PRIMARY, | PRIMARY | 152 | const | 1050 | 100.00 | Using where |
| 1 | SIMPLE | vacationpr5_ | eq_ref | PRIMARY | PRIMARY | 304 | validation2.user2_.ID_COMPANY_VACATION_PROFILE,validation2.user2_.ID_VACATION_PROFILE | 1 | 100.00 | Using index |
| 1 | SIMPLE | vacationac0_ | ref | PRIMARY,I_VACATION_ACCUMULATION_EA | PRIMARY | 304 | const,validation2.user2_.ID_USER | 5 | 100.00 | Using where |
| 1 | SIMPLE | vacationty3_ | eq_ref | PRIMARY | PRIMARY | 304 | const,validation2.vacationac0_.ID_VACATION_TYPE | 1 | 100.00 | Using where |
| 1 | SIMPLE | vacationst6_ | eq_ref | PRIMARY | PRIMARY | 608 | validation2.user2_.ID_COMPANY_VACATION_PROFILE,validation2.user2_.ID_VACATION_PROFILE,const,validation2.vacationac0_.ID_VACATION_TYPE | 1 | 100.00 | Using where |
| 1 | SIMPLE | translatio9_ | eq_ref | PRIMARY | PRIMARY | 919 | validation2.vacationty3_.ID_COMPANY_TRANSLATION,validation2.vacationty3_.ID_TRANSLATION | 1 | 100.00 | Using index |
| 1 | SIMPLE | descriptio10_ | eq_ref | PRIMARY, | PRIMARY | 951 | validation2.vacationty3_.ID_COMPANY_TRANSLATION,validation2.vacationty3_.ID_TRANSLATION,const | 1 | 100.00 | Using where |
| 1 | SIMPLE | listvalue4_ | ALL | NULL | NULL | NULL | NULL | 5282 | 100.00 | Using where; Using join buffer (flat, BNL join) |
| 1 | SIMPLE | translatio7_ | eq_ref | PRIMARY | PRIMARY | 919 | validation2.listvalue4_.ID_COMPANY_TRANSLATION,validation2.listvalue4_.ID_TRANSLATION | 1 | 100.00 | Using index |
| 1 | SIMPLE | descriptio8_ | eq_ref | PRIMARY, | PRIMARY | 951 | validation2.listvalue4_.ID_COMPANY_TRANSLATION,validation2.listvalue4_.ID_TRANSLATION,const | 1 | 100.00 | Using where |
+------+-------------+---------------+--------+------------------------------------+---------+---------+---------------------------------------------------------------------------------------------------------------------------------------+------+----------+-------------------------------------------------+
How can I force the use of the join buffer (flat, BNL join)? The first environment is the production one and has more memory and CPU.
In first environment :
join_buffer_size............ 16777216
join_buffer_space_limit..... 2097152
In second environment :
join_buffer_size............ 262144
join_buffer_space_limit..... 2097152
Is there any link/ratio between join_buffer_size and join_buffer_space_limit?
We set join_buffer_size to 16 MB because MySQLTuner recommended it.

I set join_buffer_space_limit to 128 MB, and that resolved the performance issue.
Note that MySQLTuner gives no hint for this configuration key.
SET GLOBAL join_buffer_space_limit = 1024 * 1024 * 128;
It takes time (on the order of an hour) for performance to improve.
https://mariadb.com/kb/en/library/multi-range-read-optimization/

Buffers used by WINDOW SORT operation

I have a query with the following execution plan:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 21741 |00:00:11.38 | 150K| 1088 | | | |
| 1 | SORT AGGREGATE | | 46072 | 1 | 46072 |00:00:02.92 | 138K| 241 | | | |
| 2 | FIRST ROW | | 46072 | 1 | 3761 |00:00:02.83 | 138K| 241 | | | |
|* 3 | INDEX RANGE SCAN (MIN/MAX) | VERP_VIG_VEHICLE_STAGES_N2 | 46072 | 1 | 3761 |00:00:02.79 | 138K| 241 | | | |
|* 4 | HASH JOIN RIGHT OUTER | | 1 | 37010 | 21741 |00:00:11.38 | 150K| 1088 | 3272K| 1218K| 3302K (0)|
| 5 | VIEW | | 1 | 7402 | 23548 |00:00:11.17 | 147K| 1088 | | | |
| 6 | WINDOW SORT | | 1 | 7402 | 23548 |00:00:10.82 | 79621 | 1088 | 4801K| 915K| 4267K (0)|
|* 7 | HASH JOIN RIGHT OUTER | | 1 | 7402 | 23548 |00:00:07.84 | 8837 | 847 | 1599K| 1599K| 996K (0)|
| 8 | TABLE ACCESS FULL | VERP_OTM_PS_CONTROL_TABLE | 1 | 5 | 5 |00:00:00.01 | 39 | 0 | | | |
|* 9 | FILTER | | 1 | | 23548 |00:00:07.80 | 8798 | 847 | | | |
|* 10 | HASH JOIN RIGHT OUTER | | 1 | 7402 | 71904 |00:00:07.76 | 8798 | 847 | 1421K| 1421K| 1756K (0)|
| 11 | VIEW | | 1 | 4534 | 4554 |00:00:00.01 | 27 | 0 | | | |
|* 12 | HASH JOIN | | 1 | 4534 | 4554 |00:00:00.01 | 27 | 0 | 1888K| 1888K| 1596K (0)|
| 13 | INDEX FULL SCAN | VERP_VPS_SUPPLY_VVP_N1 | 1 | 27 | 27 |00:00:00.01 | 1 | 0 | | | |
| 14 | INDEX FULL SCAN | VERP_VPS_SUPPLY_VVVP_N1 | 1 | 4534 | 4554 |00:00:00.01 | 26 | 0 | | | |
|* 15 | HASH JOIN | | 1 | 37010 | 71904 |00:00:07.67 | 8771 | 847 | 1245K| 1245K| 1722K (0)|
| 16 | SORT UNIQUE | | 1 | 37010 | 1586 |00:00:00.05 | 3279 | 0 | 124K| 124K| 110K (0)|
| 17 | TABLE ACCESS FULL | VERP_OTM_STAGED_VONS | 1 | 37010 | 21741 |00:00:00.02 | 3279 | 0 | | | |
| 18 | TABLE ACCESS BY INDEX ROWID BATCHED| VERP_VIG_VEHICLES | 1 | 246K| 36104 |00:00:07.53 | 5492 | 847 | | | |
|* 19 | INDEX RANGE SCAN | VERP_VIG_VEHICLES_N22 | 1 | 246K| 36104 |00:00:07.38 | 891 | 838 | | | |
| 20 | VIEW | | 1 | 37010 | 21741 |00:00:00.12 | 3279 | 0 | | | |
| 21 | WINDOW SORT | | 1 | 37010 | 21741 |00:00:00.11 | 3279 | 0 | 1612K| 624K| 1432K (0)|
| 22 | TABLE ACCESS FULL | VERP_OTM_STAGED_VONS | 1 | 37010 | 21741 |00:00:00.03 | 3279 | 0 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("S"."VIN"=:B1 AND "S"."STAGE_CODE"='YARD_RECEIPT')
4 - access("VINS"."VIN_SEQUENCE"="VONS"."VON_SEQUENCE" AND "VINS"."PORT_CODE"="VONS"."PORT_CODE" AND "VINS"."INT_COLOR_CODE"="VONS"."INT_COLOR_CODE" AND
"VINS"."EXT_COLOR_CODE"="VONS"."EXT_COLOR_CODE" AND "VINS"."SPEC_CODE"="VONS"."SPEC_CODE" AND "VINS"."OPTION_CODE"="VONS"."OPTION_CODE" AND
"VINS"."MODEL_CODE"="VONS"."MODEL_CODE")
7 - access("C"."PORT"=CASE "VVV"."VEHICLE_SOURCE" WHEN 'SIA' THEN '020' ELSE "from$_subquery$_006"."PORT_CODE" END )
9 - filter("VPT"."PORT"=CASE "VVV"."VEHICLE_SOURCE" WHEN 'SIA' THEN '020' ELSE "from$_subquery$_006"."PORT_CODE" END )
10 - access("VVVP"."VESSEL_PORT_ID"="VVV"."VESSEL_PORT_ID")
12 - access("VVP"."PORT_ID"="VVVP"."PORT_ID")
15 - access("VVV"."SOA_MODEL_CODE"="VPT"."MODEL_CODE" AND "VVV"."SOA_OPTION_CODE"="VPT"."OPTION_CODE" AND "VVV"."SOA_SPEC_CODE"="VPT"."SPEC_CODE" AND
"VVV"."SOA_EXT_COLOR_CODE"="VPT"."EXT_COLOR_CODE" AND "VVV"."SOA_INT_COLOR_CODE"="VPT"."INT_COLOR_CODE")
19 - access("VVV"."PS_STATUS"='NOT_MATCHED')
I am interested to know why the WINDOW SORT operation in step #6 is requiring so many buffer gets. I usually don't see that sort of thing for a WINDOW SORT operation. For example see the operation in step 21 of the same plan -- no additional buffer gets.
Does anyone know what these buffer gets are? I suspect that maybe the sort operation is spilling to disk, due to its size, and the extra buffer gets are to access those temp tablespace blocks. I'd like confirmation or alternate explanations, as appropriate. Thanks.
UPDATE
To hopefully clarify: I want to know why step 6 required added buffer gets beyond what was required to get through step 7. I.e., why it is not like the buffer gets in step 21, which did not increase the number from what was necessary to get through step 22.

Oracle "Total" plan cost is really less than some of its elements

I cannot figure out why the total cost of a plan is sometimes a very small number when the plan itself contains huge costs (and the query is indeed very slow).
Can somebody explain that to me?
Here is an example.
Apparently the costly part comes from a field in the main select that does a listagg on a subview, where the join condition with this subview is complex (we can join on one field or another).
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 875 | 20 |
| 1 | SORT GROUP BY | | 1 | 544 | |
| 2 | VIEW | | 1 | 544 | 3 |
| 3 | SORT UNIQUE | | 1 | 481 | 3 |
| 4 | NESTED LOOPS | | | | |
| 5 | NESTED LOOPS | | 3 | 1443 | 2 |
| 6 | TABLE ACCESS BY INDEX ROWID | | 7 | 140 | 1 |
| 7 | INDEX RANGE SCAN | | 7 | | 1 |
| 8 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 9 | TABLE ACCESS BY INDEX ROWID | | 1 | 461 | 1 |
| 10 | SORT GROUP BY | | 1 | 182 | |
| 11 | NESTED LOOPS | | | | |
| 12 | NESTED LOOPS | | 8 | 1456 | 3 |
| 13 | NESTED LOOPS | | 8 | 304 | 2 |
| 14 | TABLE ACCESS BY INDEX ROWID | | 7 | 154 | 1 |
| 15 | INDEX RANGE SCAN | | 7 | | 1 |
| 16 | INDEX RANGE SCAN | | 1 | 16 | 1 |
| 17 | INDEX RANGE SCAN | | 1 | | 1 |
| 18 | TABLE ACCESS BY INDEX ROWID | | 1 | 144 | 1 |
| 19 | SORT GROUP BY | | 1 | 268 | |
| 20 | VIEW | | 1 | 268 | 9 |
| 21 | SORT UNIQUE | | 1 | 108 | 9 |
| 22 | CONCATENATION | | | | |
| 23 | NESTED LOOPS | | | | |
| 24 | NESTED LOOPS | | 1 | 108 | 4 |
| 25 | NESTED LOOPS | | 1 | 79 | 3 |
| 26 | NESTED LOOPS | | 1 | 59 | 2 |
| 27 | TABLE ACCESS BY INDEX ROWID | | 1 | 16 | 1 |
| 28 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 29 | TABLE ACCESS BY INDEX ROWID | | 1 | 43 | 1 |
| 30 | INDEX RANGE SCAN | | 1 | | 1 |
| 31 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 32 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 33 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 34 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 35 | NESTED LOOPS | | | | |
| 36 | NESTED LOOPS | | 1 | 108 | 4 |
| 37 | NESTED LOOPS | | 1 | 79 | 3 |
| 38 | NESTED LOOPS | | 1 | 59 | 2 |
| 39 | TABLE ACCESS BY INDEX ROWID | | 4 | 64 | 1 |
| 40 | INDEX RANGE SCAN | | 2 | | 1 |
| 41 | TABLE ACCESS BY INDEX ROWID | | 1 | 43 | 1 |
| 42 | INDEX RANGE SCAN | | 1 | | 1 |
| 43 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 44 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 45 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 46 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 47 | SORT GROUP BY | | 1 | 330 | |
| 48 | VIEW | | 1 | 330 | 26695 |
| 49 | SORT UNIQUE | | 1 | 130 | 26695 |
| 50 | CONCATENATION | | | | |
| 51 | HASH JOIN ANTI | | 1 | 130 | 13347 |
| 52 | NESTED LOOPS | | | | |
| 53 | NESTED LOOPS | | 1 | 110 | 4 |
| 54 | NESTED LOOPS | | 1 | 81 | 3 |
| 55 | NESTED LOOPS | | 1 | 61 | 2 |
| 56 | TABLE ACCESS BY INDEX ROWID | | 1 | 16 | 1 |
| 57 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 58 | TABLE ACCESS BY INDEX ROWID | | 1 | 45 | 1 |
| 59 | INDEX RANGE SCAN | | 1 | | 1 |
| 60 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 61 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 62 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 63 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 64 | VIEW | | 164K| 3220K| 13341 |
| 65 | NESTED LOOPS | | | | |
| 66 | NESTED LOOPS | | 164K| 11M| 13341 |
| 67 | NESTED LOOPS | | 164K| 8535K| 10041 |
| 68 | TABLE ACCESS BY INDEX ROWID | | 164K| 6924K| 8391 |
| 69 | INDEX SKIP SCAN | | 2131K| | 163 |
| 70 | INDEX UNIQUE SCAN | | 1 | 10 | 1 |
| 71 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 72 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 73 | HASH JOIN ANTI | | 2 | 260 | 13347 |
| 74 | NESTED LOOPS | | | | |
| 75 | NESTED LOOPS | | 2 | 220 | 4 |
| 76 | NESTED LOOPS | | 2 | 162 | 3 |
| 77 | NESTED LOOPS | | 2 | 122 | 2 |
| 78 | TABLE ACCESS BY INDEX ROWID | | 4 | 64 | 1 |
| 79 | INDEX RANGE SCAN | | 2 | | 1 |
| 80 | TABLE ACCESS BY INDEX ROWID | | 1 | 45 | 1 |
| 81 | INDEX RANGE SCAN | | 1 | | 1 |
| 82 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 83 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 84 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 85 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 86 | VIEW | | 164K| 3220K| 13341 |
| 87 | NESTED LOOPS | | | | |
| 88 | NESTED LOOPS | | 164K| 11M| 13341 |
| 89 | NESTED LOOPS | | 164K| 8535K| 10041 |
| 90 | TABLE ACCESS BY INDEX ROWID | | 164K| 6924K| 8391 |
| 91 | INDEX SKIP SCAN | | 2131K| | 163 |
| 92 | INDEX UNIQUE SCAN | | 1 | 10 | 1 |
| 93 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 94 | TABLE ACCESS BY INDEX ROWID | | 1 | 20 | 1 |
| 95 | NESTED LOOPS OUTER | | 1 | 875 | 20 |
| 96 | NESTED LOOPS OUTER | | 1 | 846 | 19 |
| 97 | NESTED LOOPS OUTER | | 1 | 800 | 18 |
| 98 | NESTED LOOPS OUTER | | 1 | 776 | 17 |
| 99 | NESTED LOOPS OUTER | | 1 | 752 | 16 |
| 100 | NESTED LOOPS OUTER | | 1 | 641 | 15 |
| 101 | NESTED LOOPS OUTER | | 1 | 576 | 14 |
| 102 | NESTED LOOPS OUTER | | 1 | 554 | 13 |
| 103 | NESTED LOOPS OUTER | | 1 | 487 | 12 |
| 104 | NESTED LOOPS OUTER | | 1 | 434 | 11 |
| 105 | NESTED LOOPS | | 1 | 368 | 10 |
| 106 | NESTED LOOPS | | 1 | 102 | 9 |
| 107 | NESTED LOOPS OUTER | | 1 | 85 | 8 |
| 108 | NESTED LOOPS | | 1 | 68 | 7 |
| 109 | NESTED LOOPS | | 50 | 2700 | 6 |
| 110 | HASH JOIN | | 53 | 1696 | 5 |
| 111 | INLIST ITERATOR | | | | |
| 112 | TABLE ACCESS BY INDEX ROWID| | 520 | 10400 | 3 |
| 113 | INDEX RANGE SCAN | | 520 | | 1 |
| 114 | INLIST ITERATOR | | | | |
| 115 | TABLE ACCESS BY INDEX ROWID| | 91457 | 1071K| 1 |
| 116 | INDEX UNIQUE SCAN | | 2 | | 1 |
| 117 | TABLE ACCESS BY INDEX ROWID | | 1 | 22 | 1 |
| 118 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 119 | TABLE ACCESS BY INDEX ROWID | | 1 | 14 | 1 |
| 120 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 121 | TABLE ACCESS BY INDEX ROWID | | 1 | 17 | 1 |
| 122 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 123 | TABLE ACCESS BY INDEX ROWID | | 1 | 17 | 1 |
| 124 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 125 | TABLE ACCESS BY INDEX ROWID | | 1 | 266 | 1 |
| 126 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 127 | TABLE ACCESS BY INDEX ROWID | | 1 | 66 | 1 |
| 128 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 129 | TABLE ACCESS BY INDEX ROWID | | 1 | 53 | 1 |
| 130 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 131 | TABLE ACCESS BY INDEX ROWID | | 1 | 67 | 1 |
| 132 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 133 | INDEX RANGE SCAN | | 1 | 22 | 1 |
| 134 | TABLE ACCESS BY INDEX ROWID | | 1 | 65 | 1 |
| 135 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 136 | TABLE ACCESS BY INDEX ROWID | | 1 | 111 | 1 |
| 137 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 138 | TABLE ACCESS BY INDEX ROWID | | 1 | 24 | 1 |
| 139 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 140 | TABLE ACCESS BY INDEX ROWID | | 1 | 24 | 1 |
| 141 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 142 | TABLE ACCESS BY INDEX ROWID | | 1 | 46 | 1 |
| 143 | INDEX UNIQUE SCAN | | 1 | | 1 |
| 144 | TABLE ACCESS BY INDEX ROWID | | 1 | 29 | 1 |
| 145 | INDEX UNIQUE SCAN | | 1 | | 1 |
----------------------------------------------------------------------------------------------------------

The total cost of a statement is usually equal to or greater than the cost of any of its child operations. There are at least 4 exceptions to this rule.
Your plan looks like #3 but we can't be sure without looking at code.
1. FILTER
Execution plans may depend on conditions at run-time. These conditions cause FILTER operations that will dynamically decide which query block to execute. The example below uses a static condition but still demonstrates the concept. Part of the subquery is very expensive but the condition negates the whole thing.
explain plan for select * from dba_objects cross join dba_objects where 1 = 2;
select * from table(dbms_xplan.display(format => 'basic +cost'));
Plan hash value: 3258663795
--------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 0 (0)|
| 1 | FILTER | | |
| 2 | MERGE JOIN CARTESIAN | | 11M (3)|
...
2. COUNT STOPKEY
Execution plans sum child operations up until the final cost. But child operations will not always finish. In the example below it may be correct to say that part of the plan costs 214. But because of the condition where rownum <= 1 only part of that child operation may run.
explain plan for
select /*+ no_query_transformation */ *
from (select * from dba_objects join dba_objects using (owner))
where rownum <= 1;
select * from table(dbms_xplan.display(format => 'basic +cost'));
Plan hash value: 2132093199
-------------------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 (0)|
| 1 | COUNT STOPKEY | | |
| 2 | VIEW | | 4 (0)|
| 3 | VIEW | | 4 (0)|
| 4 | NESTED LOOPS | | 4 (0)|
| 5 | VIEW | DBA_OBJECTS | 2 (0)|
| 6 | UNION-ALL | | |
| 7 | HASH JOIN | | 3 (34)|
| 8 | INDEX FULL SCAN | I_USER2 | 1 (0)|
| 9 | VIEW | _CURRENT_EDITION_OBJ | 1 (0)|
| 10 | FILTER | | |
| 11 | HASH JOIN | | 214 (3)|
...
3. Subqueries in the SELECT column list
Cost aggregation does not include subqueries in the SELECT column list. A query like select ([expensive query]) from dual; will have a very small total cost. I don't understand the reason for this; Oracle estimates the subquery and the number of rows in the FROM, so surely it could multiply them together for a total cost.
explain plan for
select dummy,(select count(*) from dba_objects cross join dba_objects) from dual;
select * from table(dbms_xplan.display(format => 'basic +cost'));
Plan hash value: 3705842531
---------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 (0)|
| 1 | SORT AGGREGATE | | |
| 2 | MERGE JOIN CARTESIAN | | 11M (3)|
...
4. Other - rounding? bugs?
About 0.01% of plans still have unexplainable cost issues. I can't find any pattern among them. Perhaps it's just a rounding issue or some rare optimizer bugs. There will always be some weird cases with any model as complicated as the optimizer.
Check for more exceptions
This query can find other exceptions; it returns all plans where the first cost is less than the maximum cost.
select *
from
(
--First and Max cost per plan.
select
sql_id, plan_hash_value, id, cost
,max(cost) keep (dense_rank first order by id)
over (partition by sql_id, plan_hash_value) first_cost
,max(cost)
over (partition by sql_id, plan_hash_value) max_cost
,max(case when operation = 'COUNT' and options = 'STOPKEY' then 1 else 0 end)
over (partition by sql_id, plan_hash_value) has_count_stopkey
,max(case when operation = 'FILTER' and options is null then 1 else 0 end)
over (partition by sql_id, plan_hash_value) has_filter
,count(distinct(plan_hash_value))
over () total_plans
from v$sql_plan
--where sql_id = '61a161nm1ttjj'
order by 1,2,3
)
where first_cost < max_cost
--It's easy to exclude FILTER and COUNT STOPKEY.
and has_filter = 0
and has_count_stopkey = 0
order by 1,2,3;
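The core check in that query can be sketched on in-memory rows in plain Java: take the cost on the lowest operation id (the statement total) and compare it with the largest cost anywhere in the plan. `PlanCostCheck` and `PlanLine` are hypothetical names, and a null cost stands for an empty Cost cell; the sketch leaves out the FILTER and COUNT STOPKEY exclusions.

```java
import java.util.List;

final class PlanCostCheck {

    // One line of an execution plan: operation id and optimizer cost (null if blank).
    static final class PlanLine {
        final int id; final Long cost;
        PlanLine(int id, Long cost) { this.id = id; this.cost = cost; }
    }

    // True when the cost on the lowest id is below the maximum cost in the plan.
    static boolean totalBelowMax(List<PlanLine> plan) {
        PlanLine first = null;
        long max = Long.MIN_VALUE;
        for (PlanLine line : plan) {
            if (first == null || line.id < first.id) first = line;
            if (line.cost != null) max = Math.max(max, line.cost);
        }
        // A blank total never compares as "less than", like a NULL in SQL.
        return first != null && first.cost != null && first.cost < max;
    }
}
```

Applied to the plan in the question, where the statement total is 20 and some operations cost 26695, the check returns true.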

SSRS 2005 Matrix Rows not grouping

I have a problem: the rows are not grouping properly, and I am not sure if it is a dataset problem or a reporting problem. In the example below, how do I get the 'S003' rows to show in one row? Is there some grouping property not set correctly? This report comes from the report wizard, plus some formatting changes and drill-down. Using VS BI 2005.
Dataset
Year | Month | Cust | Item | Shipto | SaleCases | RegCases
2011 | 1     | DEM  | B123 | S000   | 0         | 54
2011 | 1     | DEM  | B123 | S001   | 0         | 54
2011 | 1     | DEM  | B123 | S002   | 0         | 54
2011 | 1     | DEM  | B123 | S003   | 0         | 54
2010 | 1     | DEM  | B123 | S003   | 754       | 0
Report
     |      |      | 2010 |     | 2011 |
     |      |      | 1    |     | 1    |
     |      |      | Sale | Reg | Sale | Reg
DEM  | B123 | S000 |      |     | 0    | 54
     |      | S001 |      |     | 0    | 54
     |      | S002 |      |     | 0    | 54
     |      | S003 |      |     | 0    | 54
DEM  | B123 | S003 | 754  | 0   |      |
Why is it creating a new row/group for the last line instead of attaching it to the existing S003 row? The only difference is the year.

Bah. Whitespace differences in the Shipto values. RTRIM() fixed it.
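The effect is easy to reproduce outside SSRS. In this Java sketch (the `ShipToGrouping` class and its method names are illustrative), two keys that differ only in trailing whitespace form separate groups until they are right-trimmed, which is roughly what RTRIM() does in the dataset query.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ShipToGrouping {

    // Removes trailing whitespace only, similar to SQL RTRIM().
    static String rtrim(String s) {
        return s.replaceAll("\\s+$", "");
    }

    // Groups ship-to codes and counts rows per group, optionally trimming first.
    static Map<String, Integer> groupCounts(List<String> shipTos, boolean trim) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String s : shipTos) {
            counts.merge(trim ? rtrim(s) : s, 1, Integer::sum);
        }
        return counts;
    }
}
```

Calling `groupCounts` on "S003" and "S003" followed by spaces yields two groups without trimming and one group with it, matching the extra matrix row in the report.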
