Slow counts finally solved - phoenix-framework

tl;dr: |> Repo.aggregate(:count, :id) is slow; use |> Repo.aggregate(:count) instead.
I'm running a podcast database with > 5 million episodes. After storing a new episode, I count the episodes for a given podcast for a counter cache like this:
episodes_count = where(Episode, podcast_id: ^podcast_id)
|> Repo.aggregate(:count, :id)
This turned out to get slower and slower, so I started to dig deeper and realized that in Postgres 12 only SELECT COUNT(*) does an index-only scan, while SELECT COUNT(e0.id) doesn't.
For a cold database (just restarted), even the first index-only scan is reasonably fast:
postgres=# \c pan_prod
You are now connected to database "pan_prod" as user "postgres".
pan_prod=# EXPLAIN ANALYZE SELECT count(*) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=348.51..348.52 rows=1 width=8) (actual time=15.823..15.823 rows=1 loops=1)
-> Index Only Scan using episodes_podcast_id_index on episodes e0 (cost=0.43..323.00 rows=10204 width=0) (actual time=1.331..14.832 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Heap Fetches: 0
Planning Time: 2.994 ms
Execution Time: 16.017 ms
It gets even faster on the second run:
pan_prod=# EXPLAIN ANALYZE SELECT count(*) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=348.51..348.52 rows=1 width=8) (actual time=5.007..5.008 rows=1 loops=1)
-> Index Only Scan using episodes_podcast_id_index on episodes e0 (cost=0.43..323.00 rows=10204 width=0) (actual time=0.042..3.548 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Heap Fetches: 0
Planning Time: 0.304 ms
Execution Time: 5.074 ms
The first bitmap heap scan, on the other hand, is terribly slow:
pan_prod=# EXPLAIN ANALYZE SELECT count(e0.id) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=37181.71..37181.72 rows=1 width=8) (actual time=4098.525..4098.526 rows=1 loops=1)
-> Bitmap Heap Scan on episodes e0 (cost=219.51..37156.20 rows=10204 width=4) (actual time=6.508..4082.558 rows=10613 loops=1)
Recheck Cond: (podcast_id = 35202)
Heap Blocks: exact=6516
-> Bitmap Index Scan on episodes_podcast_id_index (cost=0.00..216.96 rows=10204 width=0) (actual time=3.657..3.658 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Planning Time: 0.412 ms
Execution Time: 4098.719 ms
The second one is typically faster:
pan_prod=# EXPLAIN ANALYZE SELECT count(e0.id) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=37181.71..37181.72 rows=1 width=8) (actual time=18.857..18.857 rows=1 loops=1)
-> Bitmap Heap Scan on episodes e0 (cost=219.51..37156.20 rows=10204 width=4) (actual time=6.047..17.152 rows=10613 loops=1)
Recheck Cond: (podcast_id = 35202)
Heap Blocks: exact=6516
-> Bitmap Index Scan on episodes_podcast_id_index (cost=0.00..216.96 rows=10204 width=0) (actual time=3.738..3.738 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Planning Time: 0.322 ms
Execution Time: 18.999 ms
I don't get why SELECT count(e0.id) doesn't use the index, and I would like to know why.
I always thought I should prefer it, since only one column is looked at, but that doesn't seem to be the case.

I don't get why SELECT count(e0.id) doesn't use the index, and I would like to know why. I always thought I should prefer it, since only one column is looked at, but that doesn't seem to be the case.
It does use the index episodes_podcast_id_index both times.
In one case it can do so without additionally looking at the table; in the other it has to visit the table as well.
You didn't provide the index definition, but it seems that id is not part of it. I would also suspect that id is not declared NOT NULL (it might contain NULL values). Thus, the database has to fetch the id column to check whether it is NULL.
Why?
Because count(*) counts rows while count(<expression>) counts not-null values.
See: https://modern-sql.com/concept/null#aggregates
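To see the difference in isolation, here is a minimal sketch using a throwaway table (not the episodes schema):
-- count(*) counts rows; count(<expression>) only counts rows where the expression is not NULL.
CREATE TEMP TABLE t (id int);
INSERT INTO t VALUES (1), (2), (NULL);
SELECT count(*)  FROM t;   -- 3 rows
SELECT count(id) FROM t;   -- 2 non-NULL ids
-- If count(e0.id) had to stay, an index that also covers id would allow an
-- index-only scan again, e.g. (Postgres 11+ syntax, sketch only):
-- CREATE INDEX episodes_podcast_id_id_index ON episodes (podcast_id) INCLUDE (id);
Since id is presumably the non-nullable primary key here, count(*) returns the same number anyway and is the simpler fix.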

Related

PostgreSQL 9.6 performance issue due to sequential scan instead of index

I am encountering a performance issue with PostgreSQL 9.6.
I have a table:
CREATE TABLE public.cdrsinfo (
recordid bigint NOT NULL,
billid integer,
CONSTRAINT cdrsinfo_pkey PRIMARY KEY (recordid) );
And indexes created on it like this:
CREATE UNIQUE INDEX cdrsinfo_pkey ON cdrsinfo USING btree (recordid)
CREATE INDEX indx_cdrsinfo_billid ON cdrsinfo USING btree (billid)
The table has around 3M records and I've run ANALYZE on it. When running the following queries with EXPLAIN I get some strange results:
SELECT max(recordid) FROM cdrsinfo WHERE billid = 535;
Result (cost=26631.27..26631.28 rows=1 width=8)
InitPlan 1 (returns $0)
-> Limit (cost=0.57..26631.27 rows=1 width=8)
-> Index Scan Backward using cdrsinfo_pkey on cdrsinfo (cost=0.57..2291944283.82 rows=86064 width=8)
Index Cond: (recordid IS NOT NULL)
Filter: (billid = 535)
SELECT max(recordid) FROM cdrsinfo WHERE billid < 535;
Aggregate (cost=725.85..725.86 rows=1 width=8)
-> Index Scan using indx_cdrsinfo_billid on cdrsinfo (cost=0.57..725.37 rows=192 width=8)
Index Cond: (billid < 535)
If I count all the rows that have billid = 535 I get 44. My question is: why doesn't the query planner use indx_cdrsinfo_billid in the first example?
This causes huge performance issues: the first query takes ~2 hours to complete and the second one ~170 ms.
I forgot to mention a third index that I have on the table:
CREATE INDEX indx_cdrsinfo_billid_recordid ON cdrsinfo USING btree (recordid,billid)
As I mentioned, the table was analyzed before running the query. Now, when I run the EXPLAIN with ANALYZE, VERBOSE and BUFFERS, I get a very good time on the same query where billid = 535:
Result (cost=0.85..0.86 rows=1 width=8) (actual time=0.034..0.034 rows=1 loops=1)
Output: $0
Buffers: shared hit=5
InitPlan 1 (returns $0)
-> Limit (cost=0.57..0.85 rows=1 width=8) (actual time=0.031..0.031 rows=1 loops=1)
Output: cdrsinfo.recordid
Buffers: shared hit=5
-> Index Only Scan Backward using indx_cdrsinfo_billid_recordid on public.cdrsinfo (cost=0.57..24041.88 rows=89007 width=8) (actual time=0.022..0.022 rows=1 loops=1)
Output: cdrsinfo.recordid
Index Cond: ((cdrsinfo.billid = 535) AND (cdrsinfo.recordid IS NOT NULL))
Heap Fetches: 0
Buffers: shared hit=5
Planning time: 0.177 ms
Execution time: 0.056 ms
The index was there before as well; I don't understand why the query planner decided to use it now and not this morning.
Another strange thing: when I got that awful execution time I tried rewriting the query in several ways to look at the plan, and when I write it like this:
SELECT max(recordid) FROM cdrsinfo WHERE recordid IN (SELECT recordid FROM cdrsinfo WHERE billid = 535 OFFSET 0)
because of the OFFSET 0 the query planner used indx_cdrsinfo_billid, the one I would expect to be used.
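Two things that tend to help this particular pattern, sketched against the cdrsinfo table above (untested against the asker's data; the index name is made up):
-- An index leading with billid and covering recordid lets the planner answer
-- MAX(recordid) for a single billid with one index-only probe:
CREATE INDEX indx_cdrsinfo_billid_recordid_v2 ON cdrsinfo USING btree (billid, recordid);
-- The same question written as ORDER BY ... LIMIT 1; with the index above,
-- both this form and MAX() can be satisfied from the index alone:
SELECT recordid
FROM cdrsinfo
WHERE billid = 535
ORDER BY recordid DESC
LIMIT 1;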

PostgreSQL: query runs very slowly on first run, is fast on subsequent runs

I have a query that's running particularly slowly when first run against a particular customer's account. It then runs substantially more quickly on all subsequent runs. This issue is extremely pronounced on a spinning disc - it's still an issue, though much less annoying, on an SSD.
This is an explain of the first time the query runs: http://explain.depesz.com/s/NdTV
Aggregate (cost=1056.860..1056.870 rows=1 width=0) (actual time=356.740..356.740 rows=1 loops=1)
Output: five_two(*)
Buffers: shared hit=331 read=508
-> Nested Loop (cost=935.080..1056.860 rows=1 width=0) (actual time=292.458..356.712 rows=71 loops=1)
Buffers: shared hit=331 read=508
-> Nested Loop (cost=935.080..1051.990 rows=1 width=4) (actual time=292.440..356.116 rows=71 loops=1)
Output: xray_oscar.zulu_lima
Buffers: shared hit=136 read=490
-> HashAggregate (cost=935.080..935.090 rows=1 width=4) (actual time=282.669..282.673 rows=8 loops=1)
Output: foxtrot_echo.quebec
Buffers: shared hit=80 read=458
-> Sort (cost=935.060..935.070 rows=1 width=8) (actual time=282.661..282.662 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Sort Key: foxtrot_echo.five_echo
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=80 read=458
-> Nested Loop (cost=0.000..935.050 rows=1 width=8) (actual time=110.522..282.639 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Buffers: shared hit=80 read=458
-> Seq Scan on charlie_hotel_yankee (cost=0.000..870.560 rows=6 width=4) (actual time=1.880..5.777 rows=35 loops=1)
Output: seven_charlie.quebec, seven_charlie.foxtrot_mike, seven_charlie."foxtrot_india", seven_charlie.yankee, seven_charlie.seven_papa, seven_charlie.romeo_hotel, seven_charlie.foxtrot_two, seven_charlie.golf, seven_charlie.kilo_seven, seven_charlie.four
Filter: ((NOT seven_charlie.golf) AND (seven_charlie.yankee = 14732))
Rows Removed by Filter: 38090
Buffers: shared hit=2 read=392
-> Index Scan using xray_bravo on delta_charlie (cost=0.000..10.740 rows=1 width=12) (actual time=7.908..7.908 rows=0 loops=35)
Output: foxtrot_echo.quebec, foxtrot_echo.uniform, foxtrot_echo.seven_papa, foxtrot_echo.romeo_hotel, foxtrot_echo.five_echo
Index Cond: (foxtrot_echo.uniform = seven_charlie.quebec)
Filter: ((foxtrot_echo.five_echo >= 'three'::date) AND (foxtrot_echo.five_echo <= 'november_foxtrot_golf'::date))
Rows Removed by Filter: 7
Buffers: shared hit=78 read=66
-> Index Scan using whiskey_papa on xray_india (cost=0.000..116.890 rows=1 width=8) (actual time=1.267..9.176 rows=9 loops=8)
Output: xray_oscar.quebec, xray_oscar.foxtrot_mike, xray_oscar.foxtrot_tango, xray_oscar.seven_papa, xray_oscar.romeo_hotel, xray_oscar.zulu_lima, xray_oscar.lima_five, xray_oscar.bravo, xray_oscar.mike, xray_oscar.papa_hotel, xray_oscar.papa_romeo, delta_lima (...)
Index Cond: (xray_oscar.lima_five = foxtrot_echo.quebec)
Filter: ((xray_oscar.charlie_hotel_foxtrot & 1) = 0)
Buffers: shared hit=56 read=32
-> Index Scan using kilo_whiskey on seven_echo (cost=0.000..4.860 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=71)
Output: lima_november.quebec, lima_november.seven_papa, lima_november.romeo_hotel, lima_november.victor, lima_november.november_foxtrot_victor, lima_november.romeo_charlie, lima_november.whiskey_lima, lima_november.hotel, lima_november.sierra, lima_november.alpha, lima_november.zulu_yankee (...)
Index Cond: (lima_november.quebec = xray_oscar.zulu_lima)
Filter: (lima_november.yankee = 14732)
Buffers: shared hit=195 read=18
And here's the subsequent, fast runs: http://explain.depesz.com/s/K1J5
Aggregate (cost=1056.860..1056.870 rows=1 width=0) (actual time=5.783..5.783 rows=1 loops=1)
Output: five_two(*)
Buffers: shared hit=366 read=473
-> Nested Loop (cost=935.080..1056.860 rows=1 width=0) (actual time=5.516..5.770 rows=71 loops=1)
Buffers: shared hit=366 read=473
-> Nested Loop (cost=935.080..1051.990 rows=1 width=4) (actual time=5.505..5.622 rows=71 loops=1)
Output: xray_oscar.zulu_lima
Buffers: shared hit=156 read=470
-> HashAggregate (cost=935.080..935.090 rows=1 width=4) (actual time=5.496..5.496 rows=8 loops=1)
Output: foxtrot_echo.quebec
Buffers: shared hit=85 read=453
-> Sort (cost=935.060..935.070 rows=1 width=8) (actual time=5.490..5.491 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Sort Key: foxtrot_echo.five_echo
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=85 read=453
-> Nested Loop (cost=0.000..935.050 rows=1 width=8) (actual time=3.565..5.480 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Buffers: shared hit=85 read=453
-> Seq Scan on charlie_hotel_yankee (cost=0.000..870.560 rows=6 width=4) (actual time=1.865..5.170 rows=35 loops=1)
Output: seven_charlie.quebec, seven_charlie.foxtrot_mike, seven_charlie."foxtrot_india", seven_charlie.yankee, seven_charlie.seven_papa, seven_charlie.romeo_hotel, seven_charlie.foxtrot_two, seven_charlie.golf, seven_charlie.kilo_seven, seven_charlie.four
Filter: ((NOT seven_charlie.golf) AND (seven_charlie.yankee = 14732))
Rows Removed by Filter: 38090
Buffers: shared hit=1 read=393
-> Index Scan using xray_bravo on delta_charlie (cost=0.000..10.740 rows=1 width=12) (actual time=0.008..0.008 rows=0 loops=35)
Output: foxtrot_echo.quebec, foxtrot_echo.uniform, foxtrot_echo.seven_papa, foxtrot_echo.romeo_hotel, foxtrot_echo.five_echo
Index Cond: (foxtrot_echo.uniform = seven_charlie.quebec)
Filter: ((foxtrot_echo.five_echo >= 'three'::date) AND (foxtrot_echo.five_echo <= 'november_foxtrot_golf'::date))
Rows Removed by Filter: 7
Buffers: shared hit=84 read=60
-> Index Scan using whiskey_papa on xray_india (cost=0.000..116.890 rows=1 width=8) (actual time=0.004..0.014 rows=9 loops=8)
Output: xray_oscar.quebec, xray_oscar.foxtrot_mike, xray_oscar.foxtrot_tango, xray_oscar.seven_papa, xray_oscar.romeo_hotel, xray_oscar.zulu_lima, xray_oscar.lima_five, xray_oscar.bravo, xray_oscar.mike, xray_oscar.papa_hotel, xray_oscar.papa_romeo, delta_lima (...)
Index Cond: (xray_oscar.lima_five = foxtrot_echo.quebec)
Filter: ((xray_oscar.charlie_hotel_foxtrot & 1) = 0)
Buffers: shared hit=71 read=17
-> Index Scan using kilo_whiskey on seven_echo (cost=0.000..4.860 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=71)
Output: lima_november.quebec, lima_november.seven_papa, lima_november.romeo_hotel, lima_november.victor, lima_november.november_foxtrot_victor, lima_november.romeo_charlie, lima_november.whiskey_lima, lima_november.hotel, lima_november.sierra, lima_november.alpha, lima_november.zulu_yankee (...)
Index Cond: (lima_november.quebec = xray_oscar.zulu_lima)
Filter: (lima_november.yankee = 14732)
Buffers: shared hit=210 read=3
I know that Postgres is always going to be a little faster on subsequent queries compared to the first time it runs one, but I didn't think it would be this substantial a difference. Using other customer ID filters (lima_november.yankee in the query) I've seen one query run fast in 12ms and slow in 1873ms. So it's a pretty profound difference.
The query is selecting a count from a table with around 2 million rows, based on three filters. The first is a customer ID, done through a join, which is the index scan at the bottom of the query plans. The second is an IN query, which seems to be what index scan whiskey_papa is doing. The third is a bitwise statement - (xray_oscar.charlie_hotel_foxtrot & 1) = 0.
I haven't seen the buffers output of explain analyze before so don't really know where to start reading it. It seems odd that the Nested Loop at position 6 in the slow query has actual times so different to the lines nested inside it. But I don't really know how to go about improving that (or if that's even the main problem).
These tests are on postgres 9.2.13 on OS X 10.11.1. Any tips?
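One common mitigation for this kind of cold-cache slowness is to warm the relevant relations explicitly after a restart. A hedged sketch: pg_prewarm ships with PostgreSQL 9.4 and later, so it would require an upgrade from the 9.2.13 instance described here, and the relation names below are the anonymized ones from the plan:
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
-- Pull a table and an index into shared buffers ahead of the first query:
SELECT pg_prewarm('charlie_hotel_yankee');  -- the table behind the Seq Scan
SELECT pg_prewarm('whiskey_papa');          -- the index used by the inner Index Scan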

Queries running very slow on first load on PostgreSQL

We are using a PostgreSQL 9.4 database on Amazon EC2. All of our queries run very slowly on the first try, until the data gets cached; after that they are quite quick, but that is little consolation, since the first run still slows down the page load.
Here is one of the queries we use:
SELECT HE.fs_perm_sec_id,
HE.TICKER_EXCHANGE,
HE.proper_name,
OP.shares_outstanding,
(SELECT factset_industry_desc
FROM factset_industry_map AS fim
WHERE fim.factset_industry_code = HES.industry_code) AS industry,
(SELECT SUM(POSITION) AS ST_HOLDINGS
FROM OWN_STAKES_HOLDINGS S
WHERE S.POSITION > 0
AND S.fs_perm_sec_id = HE.fs_perm_sec_id
GROUP BY FS_PERM_SEC_ID) AS stake_holdings,
(SELECT SUM(CURRENT_HOLDINGS)
FROM
(SELECT CURRENT_HOLDINGS
FROM OWN_INST_HOLDINGS IHT
WHERE FS_PERM_SEC_ID=HE.FS_PERM_SEC_ID
ORDER BY CURRENT_HOLDINGS DESC LIMIT 10) A) AS top_10_inst_holdings,
(SELECT SUM(OIH.current_holdings)
FROM own_inst_holdings OIH
WHERE OIH.fs_perm_sec_id = HE.fs_perm_sec_id) AS inst_holdings
FROM own_prices OP
JOIN h_security_ticker_exchange HE ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector HES ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
ORDER BY OP.price_date DESC LIMIT 1
Ran an EXPLAIN ANALYZE and received the following results:
QUERY PLAN
Limit (cost=223.39..223.39 rows=1 width=100) (actual time=2420.644..2420.645 rows=1 loops=1)
-> Sort (cost=223.39..223.39 rows=1 width=100) (actual time=2420.643..2420.643 rows=1 loops=1)
Sort Key: op.price_date
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.26..223.39 rows=1 width=100) (actual time=2316.169..2420.566 rows=36 loops=1)
-> Nested Loop (cost=0.17..8.87 rows=1 width=104) (actual time=3.958..5.084 rows=36 loops=1)
-> Index Scan using h_sec_exch_factset_entity_id_idx on h_security_ticker_exchange he (cost=0.09..4.09 rows=1 width=92) (actual time=1.452..1.454 rows=1 loops=1)
Index Cond: ((ticker_exchange)::text = 'PG-NYS'::text)
-> Index Scan using alex_prices on own_prices op (cost=0.09..4.68 rows=33 width=23) (actual time=2.496..3.592 rows=36 loops=1)
Index Cond: ((fs_perm_sec_id)::text = (he.fs_perm_sec_id)::text)
-> Index Scan using alex_factset_entity_idx on h_entity_sector hes (cost=0.09..4.09 rows=1 width=14) (actual time=0.076..0.077 rows=1 loops=36)
Index Cond: (factset_entity_id = he.factset_entity_id)
SubPlan 1
-> Index Only Scan using alex_factset_industry_code_idx on factset_industry_map fim (cost=0.03..2.03 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=36)
Index Cond: (factset_industry_code = hes.industry_code)
Heap Fetches: 0
SubPlan 2
-> GroupAggregate (cost=0.08..2.18 rows=2 width=17) (actual time=0.735..0.735 rows=1 loops=36)
Group Key: s.fs_perm_sec_id
-> Index Only Scan using own_stakes_holdings_perm_position_idx on own_stakes_holdings s (cost=0.08..2.15 rows=14 width=17) (actual time=0.080..0.713 rows=39 loops=36)
Index Cond: ((fs_perm_sec_id = (he.fs_perm_sec_id)::text) AND ("position" > 0::numeric))
Heap Fetches: 1155
SubPlan 3
-> Aggregate (cost=11.25..11.26 rows=1 width=6) (actual time=0.166..0.166 rows=1 loops=36)
-> Limit (cost=0.09..11.22 rows=10 width=6) (actual time=0.081..0.150 rows=10 loops=36)
-> Index Only Scan Backward using alex_current_holdings_idx on own_inst_holdings iht (cost=0.09..194.87 rows=175 width=6) (actual time=0.080..0.147 rows=10 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 288
SubPlan 4
-> Aggregate (cost=194.96..194.96 rows=1 width=6) (actual time=66.102..66.102 rows=1 loops=36)
-> Index Only Scan using alex_current_holdings_idx on own_inst_holdings oih (cost=0.09..194.87 rows=175 width=6) (actual time=0.060..65.209 rows=2505 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 33453
Planning time: 1.581 ms
Execution time: 2420.830 ms
Once we remove the three SELECT SUM() subqueries it speeds up considerably, but that defeats the point of having a relational DB.
We run the queries from Node.js using the pg driver (https://www.npmjs.com/package/pg) to connect to and query the DB.
How can we speed up the queries? What additional steps could we take? We have already indexed the DB and all the fields seem to be indexed properly, but it is still not fast enough.
Any help, comments and/or suggestions are appreciated.
Nested loops with aggregates are generally a bad thing. The query below should avoid that. (Untested; a SQLFiddle would have been helpful.) Give it a spin and let me know. I'm curious how the engine plays with the window-function filter.
WITH security
AS (
SELECT HE.fs_perm_sec_id
, HE.TICKER_EXCHANGE
, HE.proper_name
, OP.shares_outstanding
, OP.price_date
FROM own_prices AS OP
JOIN h_security_ticker_exchange AS HE
ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector AS HES
ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
)
SELECT SE.fs_perm_sec_id
, SE.TICKER_EXCHANGE
, SE.proper_name
, SE.shares_outstanding
, S.stake_holdings
, IHT.top_10_inst_holdings
, OIH.inst_holdings
FROM security SE
JOIN (
SELECT S.fs_perm_sec_id
, SUM(S.POSITION) AS stake_holdings
FROM OWN_STAKES_HOLDINGS AS S
WHERE S.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
AND S.POSITION > 0
GROUP BY S.fs_perm_sec_id
) AS S
ON SE.fs_perm_sec_id = S.fs_perm_sec_id
JOIN (
SELECT ranked.FS_PERM_SEC_ID
, SUM(ranked.CURRENT_HOLDINGS) AS top_10_inst_holdings
FROM (
-- rank holdings per security first; a window function cannot appear directly in WHERE
SELECT IHT.FS_PERM_SEC_ID
, IHT.CURRENT_HOLDINGS
, ROW_NUMBER() OVER (
PARTITION BY IHT.FS_PERM_SEC_ID
ORDER BY IHT.CURRENT_HOLDINGS DESC
) AS rn
FROM OWN_INST_HOLDINGS AS IHT
WHERE IHT.FS_PERM_SEC_ID IN (
SELECT fs_perm_sec_id
FROM security
)
) AS ranked
WHERE ranked.rn <= 10
GROUP BY ranked.FS_PERM_SEC_ID
) AS IHT
ON SE.fs_perm_sec_id = IHT.fs_perm_sec_id
JOIN (
SELECT OIH.fs_perm_sec_id
, SUM(OIH.current_holdings) AS inst_holdings
FROM own_inst_holdings AS OIH
WHERE OIH.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
GROUP BY OIH.fs_perm_sec_id
) AS OIH
ON SE.fs_perm_sec_id = OIH.fs_perm_sec_id
ORDER BY SE.price_date DESC
LIMIT 1
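An alternative sketch: if the window-function filter turns out to be awkward, a LATERAL subquery (available since PostgreSQL 9.3, so fine on the 9.4 instance here) keeps the "top 10 holdings per security" logic close to the original correlated subselect. Shown for just that one column, using the table and column names from the question:
SELECT HE.fs_perm_sec_id
, t.top_10_inst_holdings
FROM h_security_ticker_exchange AS HE
CROSS JOIN LATERAL (
-- sum only the 10 largest holdings for this security
SELECT SUM(top10.current_holdings) AS top_10_inst_holdings
FROM (
SELECT IHT.current_holdings
FROM own_inst_holdings AS IHT
WHERE IHT.fs_perm_sec_id = HE.fs_perm_sec_id
ORDER BY IHT.current_holdings DESC
LIMIT 10
) AS top10
) AS t
WHERE HE.ticker_exchange = 'PG-NYS';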

Slow performance after upgrading PostgreSQL from 9.1 to 9.4

I'm getting extremely slow performance after upgrading Postgres 9.1 to 9.4. Here are examples of two queries that are running significantly more slowly.
Note: I realize that these queries could probably be rewritten to work more efficiently; however, the main thing I'm concerned about is that after upgrading to a newer version of Postgres they are suddenly running 100x more slowly! I'm hoping there's a configuration variable someplace I've overlooked.
While doing the upgrade I used the pg_upgrade command with the --link option. The configuration file is the same between 9.4 and 9.1. It's not running on the exact same hardware, but they're both running on a Linode and I've tried using 3 different Linodes now for the new server, so I don't think this is a hardware issue.
It seems like in both cases, 9.4 is using different indexes than 9.1?
9.1:
EXPLAIN ANALYZE SELECT "id", "title", "timestamp", "parent", "deleted", "sunk", "closed", "sticky", "lastupdate", "views", "oldid", "editedon", "devpost", "hideblue", "totalvotes", "statustag", "forum_category_id", "account_id" FROM "forum_posts" WHERE "parent" = 882269 ORDER BY "timestamp" DESC LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=63.87..63.87 rows=1 width=78) (actual time=0.020..0.020 rows=0 loops=1)
-> Sort (cost=63.87..63.98 rows=45 width=78) (actual time=0.018..0.018 rows=0 loops=1)
Sort Key: "timestamp"
Sort Method: quicksort Memory: 17kB
-> Index Scan using index_forum_posts_parent on forum_posts (cost=0.00..63.65 rows=45 width=78) (actual time=0.013..0.013 rows=0 loops=1)
Index Cond: (parent = 882269)
Total runtime: 0.074 ms
(7 rows)
9.4:
EXPLAIN ANALYZE SELECT "id", "title", "timestamp", "parent", "deleted", "sunk", "closed", "sticky", "lastupdate", "views", "oldid", "editedon", "devpost", "hideblue", "totalvotes", "statustag", "forum_category_id", "account_id" FROM "forum_posts" WHERE "parent" = 882269 ORDER BY "timestamp" DESC LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..63.48 rows=1 width=1078) (actual time=920.484..920.484 rows=0 loops=1)
-> Index Scan Backward using forum_posts_timestamp_index on forum_posts (cost=0.42..182622.07 rows=2896 width=1078) (actual time=920.480..920.480 rows=0 loops=1)
Filter: (parent = 882269)
Rows Removed by Filter: 1576382
Planning time: 0.166 ms
Execution time: 920.521 ms
(6 rows)
9.1:
EXPLAIN ANALYZE SELECT "user_library_images"."id", "user_library_images"."imgsrc", "user_library_images"."library_image_id", "user_library_images"."type", "user_library_images"."is_user_uploaded", "user_library_images"."credit", "user_library_images"."orig_dimensions", "user_library_images"."account_id" FROM "user_library_images" INNER JOIN "image_tags" ON "user_library_images"."id" = "image_tags"."user_library_image_id" WHERE ("user_library_images"."account_id" = 769718 AND "image_tags"."tag" ILIKE '%stone%') GROUP BY "user_library_images"."id", "user_library_images"."imgsrc", "user_library_images"."library_image_id", "user_library_images"."type", "user_library_images"."is_user_uploaded", "user_library_images"."credit", "user_library_images"."orig_dimensions", "user_library_images"."account_id" ORDER BY "user_library_images"."id";
Group (cost=2015.46..2015.49 rows=1 width=247) (actual time=0.629..0.652 rows=6 loops=1)
-> Sort (cost=2015.46..2015.47 rows=1 width=247) (actual time=0.626..0.632 rows=6 loops=1)
Sort Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
Sort Method: quicksort Memory: 19kB
-> Nested Loop (cost=0.00..2015.45 rows=1 width=247) (actual time=0.283..0.603 rows=6 loops=1)
-> Index Scan using index_user_library_images_account on user_library_images (cost=0.00..445.57 rows=285 width=247) (actual time=0.076..0.273 rows=13 loops=1)
Index Cond: (account_id = 769718)
-> Index Scan using index_image_tags_user_library_image on image_tags (cost=0.00..5.50 rows=1 width=4) (actual time=0.020..0.021 rows=0 loops=13)
Index Cond: (user_library_image_id = user_library_images.id)
Filter: (tag ~~* '%stone%'::text)
Total runtime: 0.697 ms
(11 rows)
9.4:
Group (cost=166708.13..166709.46 rows=59 width=1241) (actual time=9677.052..9677.052 rows=0 loops=1)
Group Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
-> Sort (cost=166708.13..166708.28 rows=59 width=1241) (actual time=9677.049..9677.049 rows=0 loops=1)
Sort Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
Sort Method: quicksort Memory: 17kB
-> Hash Join (cost=10113.22..166706.39 rows=59 width=1241) (actual time=9677.035..9677.035 rows=0 loops=1)
Hash Cond: (image_tags.user_library_image_id = user_library_images.id)
-> Seq Scan on image_tags (cost=0.00..156488.85 rows=11855 width=4) (actual time=0.301..9592.048 rows=63868 loops=1)
Filter: (tag ~~* '%stone%'::text)
Rows Removed by Filter: 9370406
-> Hash (cost=10045.97..10045.97 rows=5380 width=1241) (actual time=0.047..0.047 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Bitmap Heap Scan on user_library_images (cost=288.12..10045.97 rows=5380 width=1241) (actual time=0.027..0.037 rows=4 loops=1)
Recheck Cond: (account_id = 769718)
Heap Blocks: exact=4
-> Bitmap Index Scan on index_user_library_images_account (cost=0.00..286.78 rows=5380 width=0) (actual time=0.019..0.019 rows=4 loops=1)
Index Cond: (account_id = 769718)
Planning time: 0.223 ms
Execution time: 9677.109 ms
(19 rows)
====
After running the analyze script (see the answer below), the problem was solved. For reference, here's the new EXPLAIN ANALYZE output (for 9.4):
Group (cost=2062.82..2062.91 rows=4 width=248) (actual time=8.775..8.801 rows=7 loops=1)
Group Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
-> Sort (cost=2062.82..2062.83 rows=4 width=248) (actual time=8.771..8.780 rows=7 loops=1)
Sort Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
Sort Method: quicksort Memory: 19kB
-> Nested Loop (cost=0.87..2062.78 rows=4 width=248) (actual time=4.156..8.685 rows=7 loops=1)
-> Index Scan using index_user_library_images_account on user_library_images (cost=0.43..469.62 rows=304 width=248) (actual time=0.319..2.528 rows=363 loops=1)
Index Cond: (account_id = 769718)
-> Index Scan using index_image_tags_user_library_image on image_tags (cost=0.43..5.23 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=363)
Index Cond: (user_library_image_id = user_library_images.id)
Filter: (tag ~~* '%stone%'::text)
Rows Removed by Filter: 2
Planning time: 2.956 ms
Execution time: 8.907 ms
(14 rows)
Limit (cost=65.81..65.81 rows=1 width=77) (actual time=0.256..0.256 rows=0 loops=1)
-> Sort (cost=65.81..65.92 rows=47 width=77) (actual time=0.252..0.252 rows=0 loops=1)
Sort Key: "timestamp"
Sort Method: quicksort Memory: 17kB
-> Index Scan using index_forum_posts_parent on forum_posts (cost=0.43..65.57 rows=47 width=77) (actual time=0.211..0.211 rows=0 loops=1)
Index Cond: (parent = 882269)
Planning time: 2.978 ms
Execution time: 0.380 ms
(8 rows)
pg_upgrade does not copy (or migrate) statistics for your database.
So you need to analyze your tables in order to update the statistics in the migrated database. pg_upgrade will create a batch file/shell script with the name analyze_new_cluster that can be used for that.
Alternatively you can use vacuum analyze manually to achieve the same thing.
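For example, assuming the tables from the plans above, either of the following rebuilds the statistics (the analyze_new_cluster script essentially does the database-wide variant in stages):
-- database-wide:
VACUUM ANALYZE;
-- or just the tables involved in the slow plans:
ANALYZE forum_posts;
ANALYZE image_tags;
ANALYZE user_library_images;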
The missing statistics can be detected by looking at the execution plan: the differences between the estimated and actual row counts are far too high:
(cost=0.00..286.78 rows=5380 width=0) (actual time=0.019..0.019 rows=4 loops=1)
==> 5380 vs. 4 rows
or
(cost=0.00..156488.85 rows=11855 width=4) (actual time=0.301..9592.048 rows=63868 loops=1)
==> 11855 vs. 63868 rows

PostgreSQL query speed is variable

Context
I have a table that keeps netflow data (all packets intercepted by the router).
This table features approximately 5.9 million rows at the moment.
Problem
I am trying a simple query to count the number of packets received by day, which should not take long.
The first time I run it, the query takes 88 seconds, then after a second run, 33 seconds, then 5 seconds for all subsequent runs.
The main problem is not the speed of the query, but rather that after executing the same query 3 times, the speed is nearly 20 times faster.
I understand the concept of query caching; however, the performance of the original query run makes no sense to me.
Tests
The column that I am using to join (datetime) is of type timestamptz, and is indexed:
CREATE INDEX date ON netflows USING btree (datetime);
Looking at the EXPLAIN statements, the difference in execution is in the Nested Loop.
I have already run VACUUM ANALYZE on the table, with exactly the same results.
Current environment
Linux Ubuntu 12.04 VM running on VMware ESX 4.1
PostgreSQL 9.1
VM has 2 GB RAM, 2 cores.
database server is entirely dedicated to this and is doing nothing else
inserts in the table every minute (100 rows per minute)
very low disk, ram or cpu activity
Query
with date_list as (
select
series as start_date,
series + '23:59:59' as end_date
from
generate_series(
(select min(datetime) from netflows)::date,
(select max(datetime) from netflows)::date,
'1 day') as series
)
select
start_date,
end_date,
count(*)
from
netflows
inner join date_list on (datetime between start_date and end_date)
group by
start_date,
end_date;
Explain of first run (88 seconds)
Sort (cost=27007355.59..27007356.09 rows=200 width=8) (actual time=89647.054..89647.055 rows=18 loops=1)
Sort Key: date_list.start_date
Sort Method: quicksort Memory: 25kB
CTE date_list
-> Function Scan on generate_series series (cost=0.13..12.63 rows=1000 width=8) (actual time=92.567..92.667 rows=19 loops=1)
InitPlan 2 (returns $1)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=71.270..71.270 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=71.259..71.261 rows=1 loops=1)
-> Index Scan using date on netflows (cost=0.00..303662.15 rows=5945591 width=8) (actual time=71.252..71.252 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
InitPlan 4 (returns $3)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=11.786..11.787 rows=1 loops=1)
InitPlan 3 (returns $2)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=11.778..11.779 rows=1 loops=1)
-> Index Scan Backward using date on netflows (cost=0.00..303662.15 rows=5945591 width=8) (actual time=11.776..11.776 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
-> HashAggregate (cost=27007333.31..27007335.31 rows=200 width=8) (actual time=89639.167..89639.179 rows=18 loops=1)
-> Nested Loop (cost=0.00..23704227.20 rows=660621222 width=8) (actual time=92.667..88059.576 rows=5945457 loops=1)
-> CTE Scan on date_list (cost=0.00..20.00 rows=1000 width=16) (actual time=92.578..92.785 rows=19 loops=1)
-> Index Scan using date on netflows (cost=0.00..13794.89 rows=660621 width=8) (actual time=2.438..4571.884 rows=312919 loops=19)
Index Cond: ((datetime >= date_list.start_date) AND (datetime <= date_list.end_date))
Total runtime: 89668.047 ms
EXPLAIN of third run (5 seconds)
Sort (cost=27011357.45..27011357.95 rows=200 width=8) (actual time=5645.031..5645.032 rows=18 loops=1)
Sort Key: date_list.start_date
Sort Method: quicksort Memory: 25kB
CTE date_list
-> Function Scan on generate_series series (cost=0.13..12.63 rows=1000 width=8) (actual time=0.108..0.204 rows=19 loops=1)
InitPlan 2 (returns $1)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=0.050..0.050 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1)
-> Index Scan using date on netflows (cost=0.00..303705.14 rows=5946469 width=8) (actual time=0.046..0.046 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
InitPlan 4 (returns $3)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=0.026..0.026 rows=1 loops=1)
InitPlan 3 (returns $2)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=0.026..0.026 rows=1 loops=1)
-> Index Scan Backward using date on netflows (cost=0.00..303705.14 rows=5946469 width=8) (actual time=0.026..0.026 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
-> HashAggregate (cost=27011335.17..27011337.17 rows=200 width=8) (actual time=5645.005..5645.009 rows=18 loops=1)
-> Nested Loop (cost=0.00..23707741.28 rows=660718778 width=8) (actual time=0.134..4176.406 rows=5946329 loops=1)
-> CTE Scan on date_list (cost=0.00..20.00 rows=1000 width=16) (actual time=0.110..0.343 rows=19 loops=1)
-> Index Scan using date on netflows (cost=0.00..13796.94 rows=660719 width=8) (actual time=0.026..164.117 rows=312965 loops=19)
Index Cond: ((datetime >= date_list.start_date) AND (datetime <= date_list.end_date))
Total runtime: 5645.189 ms
If you are doing an INNER JOIN I don't think you need the CTE at all. You can simply write:
select
datetime::date,
count(*)
from netflows
group by datetime::date /* or GROUP BY 1 as Postgres extension */
I don't see why you need the dates table unless you want a LEFT JOIN to get zeroes where appropriate. This will mean one pass through the data.
BTW, I discourage you from using keywords like date and datetime for entities and columns; even when it's legal, it's not worth it.
WITH date_list as (
SELECT t AS start_date
,(t + interval '1d') AS end_date
FROM (
SELECT generate_series((min(datetime))::date
,(max(datetime))::date
,'1d') AS t
FROM netflows
) x
)
SELECT d.start_date
,count(n.datetime) AS ct  -- count joined rows, so days with no netflows show 0 instead of 1
FROM date_list d
LEFT JOIN netflows n ON n.datetime >= d.start_date
AND n.datetime < d.end_date
GROUP BY d.start_date;
And use a proper name for your index (already hinted by #Andrew):
CREATE INDEX netflows_date_idx ON netflows (datetime);
Major points
Assuming you want a row for every day of the calendar, as #Andrew already mentioned in his answer, I replaced the JOIN with a LEFT JOIN and count n.datetime instead of *, so days without rows show a count of 0.
It's much more efficient to grab min() and max() from netflows in one query.
Simplified type casting.
Fixed the date ranges. Your code would fail for timestamps like '2012-12-06 23:59:59.123'.
Tested this on a large table and performance was nice.
As to your original question: undoubtedly caching effects, which are to be expected - especially with limited RAM.
