PostgreSQL: query runs very slowly on first run, is fast on subsequent runs

I have a query that runs very slowly the first time it is run against a particular customer's account, and substantially faster on all subsequent runs. The problem is extremely pronounced on a spinning disc; it's still an issue, though much less annoying, on an SSD.
This is an explain of the first time the query runs: http://explain.depesz.com/s/NdTV
Aggregate (cost=1056.860..1056.870 rows=1 width=0) (actual time=356.740..356.740 rows=1 loops=1)
Output: five_two(*)
Buffers: shared hit=331 read=508
-> Nested Loop (cost=935.080..1056.860 rows=1 width=0) (actual time=292.458..356.712 rows=71 loops=1)
Buffers: shared hit=331 read=508
-> Nested Loop (cost=935.080..1051.990 rows=1 width=4) (actual time=292.440..356.116 rows=71 loops=1)
Output: xray_oscar.zulu_lima
Buffers: shared hit=136 read=490
-> HashAggregate (cost=935.080..935.090 rows=1 width=4) (actual time=282.669..282.673 rows=8 loops=1)
Output: foxtrot_echo.quebec
Buffers: shared hit=80 read=458
-> Sort (cost=935.060..935.070 rows=1 width=8) (actual time=282.661..282.662 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Sort Key: foxtrot_echo.five_echo
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=80 read=458
-> Nested Loop (cost=0.000..935.050 rows=1 width=8) (actual time=110.522..282.639 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Buffers: shared hit=80 read=458
-> Seq Scan on charlie_hotel_yankee (cost=0.000..870.560 rows=6 width=4) (actual time=1.880..5.777 rows=35 loops=1)
Output: seven_charlie.quebec, seven_charlie.foxtrot_mike, seven_charlie."foxtrot_india", seven_charlie.yankee, seven_charlie.seven_papa, seven_charlie.romeo_hotel, seven_charlie.foxtrot_two, seven_charlie.golf, seven_charlie.kilo_seven, seven_charlie.four
Filter: ((NOT seven_charlie.golf) AND (seven_charlie.yankee = 14732))
Rows Removed by Filter: 38090
Buffers: shared hit=2 read=392
-> Index Scan using xray_bravo on delta_charlie (cost=0.000..10.740 rows=1 width=12) (actual time=7.908..7.908 rows=0 loops=35)
Output: foxtrot_echo.quebec, foxtrot_echo.uniform, foxtrot_echo.seven_papa, foxtrot_echo.romeo_hotel, foxtrot_echo.five_echo
Index Cond: (foxtrot_echo.uniform = seven_charlie.quebec)
Filter: ((foxtrot_echo.five_echo >= 'three'::date) AND (foxtrot_echo.five_echo <= 'november_foxtrot_golf'::date))
Rows Removed by Filter: 7
Buffers: shared hit=78 read=66
-> Index Scan using whiskey_papa on xray_india (cost=0.000..116.890 rows=1 width=8) (actual time=1.267..9.176 rows=9 loops=8)
Output: xray_oscar.quebec, xray_oscar.foxtrot_mike, xray_oscar.foxtrot_tango, xray_oscar.seven_papa, xray_oscar.romeo_hotel, xray_oscar.zulu_lima, xray_oscar.lima_five, xray_oscar.bravo, xray_oscar.mike, xray_oscar.papa_hotel, xray_oscar.papa_romeo, delta_lima (...)
Index Cond: (xray_oscar.lima_five = foxtrot_echo.quebec)
Filter: ((xray_oscar.charlie_hotel_foxtrot & 1) = 0)
Buffers: shared hit=56 read=32
-> Index Scan using kilo_whiskey on seven_echo (cost=0.000..4.860 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=71)
Output: lima_november.quebec, lima_november.seven_papa, lima_november.romeo_hotel, lima_november.victor, lima_november.november_foxtrot_victor, lima_november.romeo_charlie, lima_november.whiskey_lima, lima_november.hotel, lima_november.sierra, lima_november.alpha, lima_november.zulu_yankee (...)
Index Cond: (lima_november.quebec = xray_oscar.zulu_lima)
Filter: (lima_november.yankee = 14732)
Buffers: shared hit=195 read=18
And here are the subsequent, fast runs: http://explain.depesz.com/s/K1J5
Aggregate (cost=1056.860..1056.870 rows=1 width=0) (actual time=5.783..5.783 rows=1 loops=1)
Output: five_two(*)
Buffers: shared hit=366 read=473
-> Nested Loop (cost=935.080..1056.860 rows=1 width=0) (actual time=5.516..5.770 rows=71 loops=1)
Buffers: shared hit=366 read=473
-> Nested Loop (cost=935.080..1051.990 rows=1 width=4) (actual time=5.505..5.622 rows=71 loops=1)
Output: xray_oscar.zulu_lima
Buffers: shared hit=156 read=470
-> HashAggregate (cost=935.080..935.090 rows=1 width=4) (actual time=5.496..5.496 rows=8 loops=1)
Output: foxtrot_echo.quebec
Buffers: shared hit=85 read=453
-> Sort (cost=935.060..935.070 rows=1 width=8) (actual time=5.490..5.491 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Sort Key: foxtrot_echo.five_echo
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=85 read=453
-> Nested Loop (cost=0.000..935.050 rows=1 width=8) (actual time=3.565..5.480 rows=8 loops=1)
Output: foxtrot_echo.quebec, foxtrot_echo.five_echo
Buffers: shared hit=85 read=453
-> Seq Scan on charlie_hotel_yankee (cost=0.000..870.560 rows=6 width=4) (actual time=1.865..5.170 rows=35 loops=1)
Output: seven_charlie.quebec, seven_charlie.foxtrot_mike, seven_charlie."foxtrot_india", seven_charlie.yankee, seven_charlie.seven_papa, seven_charlie.romeo_hotel, seven_charlie.foxtrot_two, seven_charlie.golf, seven_charlie.kilo_seven, seven_charlie.four
Filter: ((NOT seven_charlie.golf) AND (seven_charlie.yankee = 14732))
Rows Removed by Filter: 38090
Buffers: shared hit=1 read=393
-> Index Scan using xray_bravo on delta_charlie (cost=0.000..10.740 rows=1 width=12) (actual time=0.008..0.008 rows=0 loops=35)
Output: foxtrot_echo.quebec, foxtrot_echo.uniform, foxtrot_echo.seven_papa, foxtrot_echo.romeo_hotel, foxtrot_echo.five_echo
Index Cond: (foxtrot_echo.uniform = seven_charlie.quebec)
Filter: ((foxtrot_echo.five_echo >= 'three'::date) AND (foxtrot_echo.five_echo <= 'november_foxtrot_golf'::date))
Rows Removed by Filter: 7
Buffers: shared hit=84 read=60
-> Index Scan using whiskey_papa on xray_india (cost=0.000..116.890 rows=1 width=8) (actual time=0.004..0.014 rows=9 loops=8)
Output: xray_oscar.quebec, xray_oscar.foxtrot_mike, xray_oscar.foxtrot_tango, xray_oscar.seven_papa, xray_oscar.romeo_hotel, xray_oscar.zulu_lima, xray_oscar.lima_five, xray_oscar.bravo, xray_oscar.mike, xray_oscar.papa_hotel, xray_oscar.papa_romeo, delta_lima (...)
Index Cond: (xray_oscar.lima_five = foxtrot_echo.quebec)
Filter: ((xray_oscar.charlie_hotel_foxtrot & 1) = 0)
Buffers: shared hit=71 read=17
-> Index Scan using kilo_whiskey on seven_echo (cost=0.000..4.860 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=71)
Output: lima_november.quebec, lima_november.seven_papa, lima_november.romeo_hotel, lima_november.victor, lima_november.november_foxtrot_victor, lima_november.romeo_charlie, lima_november.whiskey_lima, lima_november.hotel, lima_november.sierra, lima_november.alpha, lima_november.zulu_yankee (...)
Index Cond: (lima_november.quebec = xray_oscar.zulu_lima)
Filter: (lima_november.yankee = 14732)
Buffers: shared hit=210 read=3
I know Postgres is always going to be a little faster on subsequent runs of a query than on the first, but I didn't think the difference would be this substantial. Using other customer ID filters (lima_november.yankee in the query) I've seen the same query run in 12ms warm and 1873ms cold. So it's a pretty profound difference.
The query is selecting a count from a table with around 2 million rows, based on three filters. The first is a customer ID, done through a join, which is the index scan at the bottom of the query plans. The second is an IN query, which seems to be what index scan whiskey_papa is doing. The third is a bitwise statement - (xray_oscar.charlie_hotel_foxtrot & 1) = 0.
I haven't seen the buffers output of EXPLAIN ANALYZE before, so I don't really know where to start reading it. It seems odd that the Nested Loop at position 6 in the slow query has actual times so different from the lines nested inside it. But I don't really know how to go about improving that (or whether that's even the main problem).
These tests are on PostgreSQL 9.2.13 on OS X 10.11.1. Any tips?
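For what it's worth, the Buffers lines show where the time goes: both runs fetch a similar number of blocks from outside shared buffers (read=508 cold versus read=473 warm), so the speedup comes from the OS page cache; the first run's reads hit the spinning disc, while later ones are served from RAM. If the cold-start cost matters, the relations can be warmed explicitly. A minimal sketch (pg_prewarm ships with PostgreSQL 9.4 and later, so on 9.2 a throwaway scan is the usual stand-in; relation names are taken from the anonymized plans above):
-- PostgreSQL 9.4+: pull a table or index into shared buffers
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('charlie_hotel_yankee'); -- table hit by the seq scan
SELECT pg_prewarm('xray_bravo');           -- index driving the nested loop
-- on 9.2, a throwaway scan has a similar effect:
-- SELECT count(*) FROM charlie_hotel_yankee;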

Related

Slow counts finally solved

tl;dr: |> Repo.aggregate(:count, :id) is slow; use |> Repo.aggregate(:count) instead
I'm running a podcast database with > 5 million episodes. After storing a new episode, I count the episodes for a given podcast for a counter cache like this:
episodes_count = where(Episode, podcast_id: ^podcast_id)
|> Repo.aggregate(:count, :id)
This turned out to get slower and slower, so I started digging deeper and realized that in Postgres 12 only SELECT COUNT(*) does an index-only scan, while SELECT COUNT(e0.id) doesn't.
For a cold database (just restarted) even the first index scan is reasonably fast:
postgres=# \c pan_prod
You are now connected to database »pan_prod« as user »postgres«.
pan_prod=# EXPLAIN ANALYZE SELECT count(*) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=348.51..348.52 rows=1 width=8) (actual time=15.823..15.823 rows=1 loops=1)
-> Index Only Scan using episodes_podcast_id_index on episodes e0 (cost=0.43..323.00 rows=10204 width=0) (actual time=1.331..14.832 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Heap Fetches: 0
Planning Time: 2.994 ms
Execution Time: 16.017 ms
It gets even faster on the second run:
pan_prod=# EXPLAIN ANALYZE SELECT count(*) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=348.51..348.52 rows=1 width=8) (actual time=5.007..5.008 rows=1 loops=1)
-> Index Only Scan using episodes_podcast_id_index on episodes e0 (cost=0.43..323.00 rows=10204 width=0) (actual time=0.042..3.548 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Heap Fetches: 0
Planning Time: 0.304 ms
Execution Time: 5.074 ms
The first bitmap heap scan, by contrast, is terribly slow:
pan_prod=# EXPLAIN ANALYZE SELECT count(e0.id) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=37181.71..37181.72 rows=1 width=8) (actual time=4098.525..4098.526 rows=1 loops=1)
-> Bitmap Heap Scan on episodes e0 (cost=219.51..37156.20 rows=10204 width=4) (actual time=6.508..4082.558 rows=10613 loops=1)
Recheck Cond: (podcast_id = 35202)
Heap Blocks: exact=6516
-> Bitmap Index Scan on episodes_podcast_id_index (cost=0.00..216.96 rows=10204 width=0) (actual time=3.657..3.658 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Planning Time: 0.412 ms
Execution Time: 4098.719 ms
The second one is typically faster:
pan_prod=# EXPLAIN ANALYZE SELECT count(e0.id) FROM "episodes" AS e0 WHERE (e0."podcast_id" = 35202);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=37181.71..37181.72 rows=1 width=8) (actual time=18.857..18.857 rows=1 loops=1)
-> Bitmap Heap Scan on episodes e0 (cost=219.51..37156.20 rows=10204 width=4) (actual time=6.047..17.152 rows=10613 loops=1)
Recheck Cond: (podcast_id = 35202)
Heap Blocks: exact=6516
-> Bitmap Index Scan on episodes_podcast_id_index (cost=0.00..216.96 rows=10204 width=0) (actual time=3.738..3.738 rows=10613 loops=1)
Index Cond: (podcast_id = 35202)
Planning Time: 0.322 ms
Execution Time: 18.999 ms
I don't get why SELECT count(e0.id) doesn't use the index-only scan, and I would like to know why.
I always thought I should prefer count(id), since only one column is looked at, but that seems not to be the case.
It does use the index episodes_podcast_id_index both times.
Once it can do so without additionally looking at the table (an index-only scan); the other time it has to look into the table as well.
You didn't provide the index definition, but it seems id is not part of it. I also suspect that id is not declared NOT NULL (it might contain null values), so the DB has to fetch the id column from the heap to check whether it is null.
Why?
Because count(*) counts rows while count(<expression>) counts not-null values.
See: https://modern-sql.com/concept/null#aggregates
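If you really need count(e0.id) to be answered by an index-only scan, one option is a covering index so the id values live in the index itself. A sketch, assuming Postgres 11+ for INCLUDE (which holds here, as the post is on Postgres 12); the index name is made up:
-- store id as a non-key column so count(e0.id) needs no heap fetches
CREATE INDEX episodes_podcast_id_incl_id_idx
ON episodes (podcast_id) INCLUDE (id);
That said, the tl;dr stands: count(*) against the existing index is the simpler fix.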

Query runs much slower using JDBC

I have two different queries that take about the same amount of time to execute when I time them in Adminer or DBeaver.
Query one
select * from state where state_name = 'Florida';
When I run the query above in Adminer it takes anywhere from
0.032 s to 0.058 s
EXPLAIN ANALYZE
Seq Scan on state (cost=0.00..3981.50 rows=1 width=28) (actual time=1.787..15.047 rows=1 loops=1)
Filter: (state_name = 'Florida'::citext)
Rows Removed by Filter: 50
Planning Time: 0.486 ms
Execution Time: 15.779 ms
Query two
select
property.id as property_id ,
full_address,
street_address,
street.street,
city.city as city,
state.state_code as state_code,
zipcode.zipcode as zipcode
from
property
inner join street on
street.id = property.street_id
inner join city on
city.id = property.city_id
inner join state on
state.id = property.state_id
inner join zipcode on
zipcode.id = property.zipcode_id
where
full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211';
The above query takes from
0.025 s to 0.048 s
EXPLAIN ANALYZE
Nested Loop (cost=29.82..65.96 rows=1 width=97) (actual time=0.668..0.671 rows=1 loops=1)
-> Nested Loop (cost=29.53..57.65 rows=1 width=107) (actual time=0.617..0.620 rows=1 loops=1)
-> Nested Loop (cost=29.25..49.30 rows=1 width=120) (actual time=0.582..0.585 rows=1 loops=1)
-> Nested Loop (cost=28.97..41.00 rows=1 width=127) (actual time=0.532..0.534 rows=1 loops=1)
-> Bitmap Heap Scan on property (cost=28.54..32.56 rows=1 width=131) (actual time=0.454..0.456 rows=1 loops=1)
Recheck Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
Heap Blocks: exact=1
-> Bitmap Index Scan on property_full_address (cost=0.00..28.54 rows=1 width=0) (actual time=0.426..0.426 rows=1 loops=1)
Index Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
-> Index Scan using street_pkey on street (cost=0.42..8.44 rows=1 width=28) (actual time=0.070..0.070 rows=1 loops=1)
Index Cond: (id = property.street_id)
-> Index Scan using city_id_pk on city (cost=0.29..8.30 rows=1 width=25) (actual time=0.047..0.047 rows=1 loops=1)
Index Cond: (id = property.city_id)
-> Index Scan using state_id_pk on state (cost=0.28..8.32 rows=1 width=19) (actual time=0.032..0.032 rows=1 loops=1)
Index Cond: (id = property.state_id)
-> Index Scan using zipcode_id_pk on zipcode (cost=0.29..8.30 rows=1 width=22) (actual time=0.048..0.048 rows=1 loops=1)
Index Cond: (id = property.zipcode_id)
Planning Time: 5.473 ms
Execution Time: 1.601 ms
I have the following methods, which use Spring's JdbcTemplate to execute the same queries.
Query one
public void performanceTest(String str) {
    template.queryForObject(
            "select * from state where state_name = ?",
            new Object[] { str }, (result, rowNum) -> {
                return result.getObject("state_name");
            });
}
time: 140ms, which is 0.14 seconds
Query two
public void performanceTest(String str) {
    template.queryForObject(
            "SELECT property.id AS property_id , full_address, street_address, street.street, city.city as city, state.state_code as state_code, zipcode.zipcode as zipcode FROM property INNER JOIN street ON street.id = property.street_id INNER JOIN city ON city.id = property.city_id INNER JOIN state ON state.id = property.state_id INNER JOIN zipcode ON zipcode.id = property.zipcode_id WHERE full_address = ?",
            new Object[] { str }, (result, rowNum) -> {
                return result.getObject("property_id");
            });
}
The time it takes to execute the method above is
time: 828 ms, which is 0.828 seconds
I am timing the method's execution time using this code below
long startTime1 = System.nanoTime();
propertyRepo.performanceTest(address); //or "Florida" depending which query I'm testing
long endTime1 = System.nanoTime();
long duration1 = TimeUnit.MILLISECONDS.convert((endTime1 - startTime1), TimeUnit.NANOSECONDS);
System.out.println("time: " + duration1);
Why is query two so much slower when I run it from JDBC compared to when I run it from Adminer? Anything I can do to improve the performance for query two?
EDIT:
I created two PHP scripts containing the two queries respectively. They take about the same amount of time in PHP, so I assume this has something to do with JDBC. The PHP times are a little higher than Java's for query one, since I am not using any connection pooling, but both queries take pretty much the same amount of time to execute. Something is causing a delay with query two under JDBC.
EDIT:
When I run the query using a PreparedStatement it's slow, but it's fast when I run it with a plain Statement. I ran EXPLAIN ANALYZE for both, using PreparedStatement and Statement:
preparedStatement explain analyze
Nested Loop (cost=1.27..315241.91 rows=1 width=97) (actual time=0.091..688.583 rows=1 loops=1)
-> Nested Loop (cost=0.98..315233.61 rows=1 width=107) (actual time=0.079..688.571 rows=1 loops=1)
-> Nested Loop (cost=0.71..315225.26 rows=1 width=120) (actual time=0.069..688.561 rows=1 loops=1)
-> Nested Loop (cost=0.42..315216.95 rows=1 width=127) (actual time=0.057..688.548 rows=1 loops=1)
-> Seq Scan on property (cost=0.00..315208.51 rows=1 width=131) (actual time=0.032..688.522 rows=1 loops=1)
Filter: ((full_address)::text = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::text)
Rows Removed by Filter: 8790
-> Index Scan using street_pkey on street (cost=0.42..8.44 rows=1 width=28) (actual time=0.019..0.019 rows=1 loops=1)
Index Cond: (id = property.street_id)
-> Index Scan using city_id_pk on city (cost=0.29..8.30 rows=1 width=25) (actual time=0.010..0.010 rows=1 loops=1)
Index Cond: (id = property.city_id)
-> Index Scan using state_id_pk on state (cost=0.28..8.32 rows=1 width=19) (actual time=0.008..0.008 rows=1 loops=1)
Index Cond: (id = property.state_id)
-> Index Scan using zipcode_id_pk on zipcode (cost=0.29..8.30 rows=1 width=22) (actual time=0.010..0.010 rows=1 loops=1)
Index Cond: (id = property.zipcode_id)
Planning Time: 2.400 ms
Execution Time: 688.674 ms
statement explain analyze
Nested Loop (cost=29.82..65.96 rows=1 width=97) (actual time=0.232..0.235 rows=1 loops=1)
-> Nested Loop (cost=29.53..57.65 rows=1 width=107) (actual time=0.220..0.223 rows=1 loops=1)
-> Nested Loop (cost=29.25..49.30 rows=1 width=120) (actual time=0.211..0.213 rows=1 loops=1)
-> Nested Loop (cost=28.97..41.00 rows=1 width=127) (actual time=0.198..0.200 rows=1 loops=1)
-> Bitmap Heap Scan on property (cost=28.54..32.56 rows=1 width=131) (actual time=0.175..0.177 rows=1 loops=1)
Recheck Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
Heap Blocks: exact=1
-> Bitmap Index Scan on property_full_address (cost=0.00..28.54 rows=1 width=0) (actual time=0.162..0.162 rows=1 loops=1)
Index Cond: (full_address = '139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211'::citext)
-> Index Scan using street_pkey on street (cost=0.42..8.44 rows=1 width=28) (actual time=0.017..0.017 rows=1 loops=1)
Index Cond: (id = property.street_id)
-> Index Scan using city_id_pk on city (cost=0.29..8.30 rows=1 width=25) (actual time=0.010..0.010 rows=1 loops=1)
Index Cond: (id = property.city_id)
-> Index Scan using state_id_pk on state (cost=0.28..8.32 rows=1 width=19) (actual time=0.007..0.007 rows=1 loops=1)
Index Cond: (id = property.state_id)
-> Index Scan using zipcode_id_pk on zipcode (cost=0.29..8.30 rows=1 width=22) (actual time=0.010..0.010 rows=1 loops=1)
Index Cond: (id = property.zipcode_id)
Planning Time: 2.442 ms
Execution Time: 0.345 ms
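Note the filter lines: the PreparedStatement plan compares (full_address)::text = '...'::text, i.e. the citext column has been cast to text, which makes the index on full_address unusable, while the plain Statement plan keeps the citext comparison and gets the bitmap index scan. The same effect can be reproduced in psql without JDBC; a sketch, using the table and column names from the plans above:
-- parameter declared as text: the citext column is cast to text,
-- so the index on full_address cannot be used (seq scan)
PREPARE by_addr_text(text) AS
SELECT id FROM property WHERE full_address = $1;
EXPLAIN ANALYZE EXECUTE by_addr_text('139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211');

-- parameter declared as citext: the index is usable again
PREPARE by_addr_citext(citext) AS
SELECT id FROM property WHERE full_address = $1;
EXPLAIN ANALYZE EXECUTE by_addr_citext('139-Skillman-Ave-Apt-5C-Brooklyn-NY-11211');
(pgjdbc sends String parameters as varchar by default; its stringtype=unspecified connection property sends them untyped instead, letting the server resolve the comparison to citext.)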
It's because of the connection pool that is used by the different clients.
You can set up a fast connection pool like HikariCP for JDBC like this:
import java.sql.Connection;
import java.sql.SQLException;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class HikariCPDataSource {

    private static HikariConfig config = new HikariConfig();
    private static HikariDataSource ds;

    static {
        config.setJdbcUrl("jdbc:h2:mem:test");
        config.setUsername("user");
        config.setPassword("password");
        // cache prepared statements on the client side
        config.addDataSourceProperty("cachePrepStmts", "true");
        config.addDataSourceProperty("prepStmtCacheSize", "250");
        config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
        ds = new HikariDataSource(config);
    }

    public static Connection getConnection() throws SQLException {
        return ds.getConnection();
    }

    private HikariCPDataSource() {}
}

Queries running very slow on first load on PostgreSQL

We are using a PostgreSQL version 9.4 database on Amazon EC2. All of our queries run very slowly the first time they are executed; after that they are cached and run quite quickly. But that is little consolation, since the slow first run still slows down the page load.
Here is one of the queries we use:
SELECT HE.fs_perm_sec_id,
HE.TICKER_EXCHANGE,
HE.proper_name,
OP.shares_outstanding,
(SELECT factset_industry_desc
FROM factset_industry_map AS fim
WHERE fim.factset_industry_code = HES.industry_code) AS industry,
(SELECT SUM(POSITION) AS ST_HOLDINGS
FROM OWN_STAKES_HOLDINGS S
WHERE S.POSITION > 0
AND S.fs_perm_sec_id = HE.fs_perm_sec_id
GROUP BY FS_PERM_SEC_ID) AS stake_holdings,
(SELECT SUM(CURRENT_HOLDINGS)
FROM
(SELECT CURRENT_HOLDINGS
FROM OWN_INST_HOLDINGS IHT
WHERE FS_PERM_SEC_ID=HE.FS_PERM_SEC_ID
ORDER BY CURRENT_HOLDINGS DESC LIMIT 10) A) AS top_10_inst_holdings,
(SELECT SUM(OIH.current_holdings)
FROM own_inst_holdings OIH
WHERE OIH.fs_perm_sec_id = HE.fs_perm_sec_id) AS inst_holdings
FROM own_prices OP
JOIN h_security_ticker_exchange HE ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector HES ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
ORDER BY OP.price_date DESC LIMIT 1
Running EXPLAIN ANALYZE gives the following results:
QUERY PLAN
Limit (cost=223.39..223.39 rows=1 width=100) (actual time=2420.644..2420.645 rows=1 loops=1)
-> Sort (cost=223.39..223.39 rows=1 width=100) (actual time=2420.643..2420.643 rows=1 loops=1)
Sort Key: op.price_date
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.26..223.39 rows=1 width=100) (actual time=2316.169..2420.566 rows=36 loops=1)
-> Nested Loop (cost=0.17..8.87 rows=1 width=104) (actual time=3.958..5.084 rows=36 loops=1)
-> Index Scan using h_sec_exch_factset_entity_id_idx on h_security_ticker_exchange he (cost=0.09..4.09 rows=1 width=92) (actual time=1.452..1.454 rows=1 loops=1)
Index Cond: ((ticker_exchange)::text = 'PG-NYS'::text)
-> Index Scan using alex_prices on own_prices op (cost=0.09..4.68 rows=33 width=23) (actual time=2.496..3.592 rows=36 loops=1)
Index Cond: ((fs_perm_sec_id)::text = (he.fs_perm_sec_id)::text)
-> Index Scan using alex_factset_entity_idx on h_entity_sector hes (cost=0.09..4.09 rows=1 width=14) (actual time=0.076..0.077 rows=1 loops=36)
Index Cond: (factset_entity_id = he.factset_entity_id)
SubPlan 1
-> Index Only Scan using alex_factset_industry_code_idx on factset_industry_map fim (cost=0.03..2.03 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=36)
Index Cond: (factset_industry_code = hes.industry_code)
Heap Fetches: 0
SubPlan 2
-> GroupAggregate (cost=0.08..2.18 rows=2 width=17) (actual time=0.735..0.735 rows=1 loops=36)
Group Key: s.fs_perm_sec_id
-> Index Only Scan using own_stakes_holdings_perm_position_idx on own_stakes_holdings s (cost=0.08..2.15 rows=14 width=17) (actual time=0.080..0.713 rows=39 loops=36)
Index Cond: ((fs_perm_sec_id = (he.fs_perm_sec_id)::text) AND ("position" > 0::numeric))
Heap Fetches: 1155
SubPlan 3
-> Aggregate (cost=11.25..11.26 rows=1 width=6) (actual time=0.166..0.166 rows=1 loops=36)
-> Limit (cost=0.09..11.22 rows=10 width=6) (actual time=0.081..0.150 rows=10 loops=36)
-> Index Only Scan Backward using alex_current_holdings_idx on own_inst_holdings iht (cost=0.09..194.87 rows=175 width=6) (actual time=0.080..0.147 rows=10 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 288
SubPlan 4
-> Aggregate (cost=194.96..194.96 rows=1 width=6) (actual time=66.102..66.102 rows=1 loops=36)
-> Index Only Scan using alex_current_holdings_idx on own_inst_holdings oih (cost=0.09..194.87 rows=175 width=6) (actual time=0.060..65.209 rows=2505 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 33453
Planning time: 1.581 ms
Execution time: 2420.830 ms
If we remove the three SELECT SUM() subqueries, the query speeds up considerably, but that defeats the point of having a relational DB.
We run the queries from Node.js using the pg module (https://www.npmjs.com/package/pg) to connect to and query the DB.
How can we speed up these queries? What additional steps could we take? We have already indexed the DB, and all the fields seem to be indexed properly, but it is still not fast enough.
Any help, comments and/or suggestions are appreciated.
Nested loops with aggregates are generally a bad thing. The query below should avoid that. (Untested; a SQLFiddle would have been helpful.) Give it a spin and let me know; I'm curious how the engine handles the window-function filter.
WITH security
AS (
SELECT HE.fs_perm_sec_id
, HE.TICKER_EXCHANGE
, HE.proper_name
, OP.shares_outstanding
, OP.price_date
FROM own_prices AS OP
JOIN h_security_ticker_exchange AS HE
ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector AS HES
ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
)
SELECT SE.fs_perm_sec_id
, SE.TICKER_EXCHANGE
, SE.proper_name
, SE.shares_outstanding
, S.stake_holdings
, IHT.top_10_inst_holdings
, OIH.inst_holdings
FROM security SE
JOIN (
SELECT S.fs_perm_sec_id
, SUM(S.POSITION) AS stake_holdings
FROM OWN_STAKES_HOLDINGS AS S
WHERE S.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
AND S.POSITION > 0
GROUP BY S.fs_perm_sec_id
) AS S
ON SE.fs_perm_sec_id = S.fs_perm_sec_id
JOIN (
SELECT ranked.FS_PERM_SEC_ID
, SUM(ranked.CURRENT_HOLDINGS) AS top_10_inst_holdings
FROM (
SELECT IHT.FS_PERM_SEC_ID
, IHT.CURRENT_HOLDINGS
/* window functions are not allowed in WHERE, so rank in a
   derived table and filter on the result outside */
, ROW_NUMBER() OVER (
PARTITION BY IHT.FS_PERM_SEC_ID
ORDER BY IHT.CURRENT_HOLDINGS DESC
) AS rn
FROM OWN_INST_HOLDINGS AS IHT
WHERE IHT.FS_PERM_SEC_ID IN (
SELECT fs_perm_sec_id
FROM security
)
) AS ranked
WHERE ranked.rn <= 10
GROUP BY ranked.FS_PERM_SEC_ID
) AS IHT
ON SE.fs_perm_sec_id = IHT.fs_perm_sec_id
JOIN (
SELECT OIH.fs_perm_sec_id
, SUM(OIH.current_holdings) AS inst_holdings
FROM own_inst_holdings AS OIH
WHERE OIH.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
GROUP BY OIH.fs_perm_sec_id
) AS OIH
ON SE.fs_perm_sec_id = OIH.fs_perm_sec_id
ORDER BY SE.price_date DESC
LIMIT 1
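If the window-function branch turns out to be slow, a LATERAL subquery (available since 9.3, so usable on your 9.4) is another way to express the top-10 sum. A sketch, not part of the query above, reusing the security CTE and the names from the question:
SELECT SE.fs_perm_sec_id
, t.top_10_inst_holdings
FROM security SE
CROSS JOIN LATERAL (
SELECT SUM(x.current_holdings) AS top_10_inst_holdings
FROM (
SELECT IHT.current_holdings
FROM own_inst_holdings IHT
WHERE IHT.fs_perm_sec_id = SE.fs_perm_sec_id
ORDER BY IHT.current_holdings DESC
LIMIT 10
) x
) t;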

Slow performance after upgrading PostgreSQL from 9.1 to 9.4

I'm getting extremely slow performance after upgrading Postgres 9.1 to 9.4. Here are two example queries that now run significantly more slowly.
Note: I realize these queries could probably be rewritten to be more efficient; my main concern, however, is that after upgrading to a newer version of Postgres they suddenly run 100x more slowly. I'm hoping there's a configuration variable somewhere that I've overlooked.
While doing the upgrade I used the pg_upgrade command with the --link option. The configuration file is the same between 9.4 and 9.1. It's not running on the exact same hardware, but they're both running on a Linode and I've tried using 3 different Linodes now for the new server, so I don't think this is a hardware issue.
It seems like in both cases, 9.4 is using different indexes than 9.1?
9.1:
EXPLAIN ANALYZE SELECT "id", "title", "timestamp", "parent", "deleted", "sunk", "closed", "sticky", "lastupdate", "views", "oldid", "editedon", "devpost", "hideblue", "totalvotes", "statustag", "forum_category_id", "account_id" FROM "forum_posts" WHERE "parent" = 882269 ORDER BY "timestamp" DESC LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=63.87..63.87 rows=1 width=78) (actual time=0.020..0.020 rows=0 loops=1)
-> Sort (cost=63.87..63.98 rows=45 width=78) (actual time=0.018..0.018 rows=0 loops=1)
Sort Key: "timestamp"
Sort Method: quicksort Memory: 17kB
-> Index Scan using index_forum_posts_parent on forum_posts (cost=0.00..63.65 rows=45 width=78) (actual time=0.013..0.013 rows=0 loops=1)
Index Cond: (parent = 882269)
Total runtime: 0.074 ms
(7 rows)
9.4:
EXPLAIN ANALYZE SELECT "id", "title", "timestamp", "parent", "deleted", "sunk", "closed", "sticky", "lastupdate", "views", "oldid", "editedon", "devpost", "hideblue", "totalvotes", "statustag", "forum_category_id", "account_id" FROM "forum_posts" WHERE "parent" = 882269 ORDER BY "timestamp" DESC LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..63.48 rows=1 width=1078) (actual time=920.484..920.484 rows=0 loops=1)
-> Index Scan Backward using forum_posts_timestamp_index on forum_posts (cost=0.42..182622.07 rows=2896 width=1078) (actual time=920.480..920.480 rows=0 loops=1)
Filter: (parent = 882269)
Rows Removed by Filter: 1576382
Planning time: 0.166 ms
Execution time: 920.521 ms
(6 rows)
9.1:
EXPLAIN ANALYZE SELECT "user_library_images"."id", "user_library_images"."imgsrc", "user_library_images"."library_image_id", "user_library_images"."type", "user_library_images"."is_user_uploaded", "user_library_images"."credit", "user_library_images"."orig_dimensions", "user_library_images"."account_id" FROM "user_library_images" INNER JOIN "image_tags" ON "user_library_images"."id" = "image_tags"."user_library_image_id" WHERE ("user_library_images"."account_id" = 769718 AND "image_tags"."tag" ILIKE '%stone%') GROUP BY "user_library_images"."id", "user_library_images"."imgsrc", "user_library_images"."library_image_id", "user_library_images"."type", "user_library_images"."is_user_uploaded", "user_library_images"."credit", "user_library_images"."orig_dimensions", "user_library_images"."account_id" ORDER BY "user_library_images"."id";
Group (cost=2015.46..2015.49 rows=1 width=247) (actual time=0.629..0.652 rows=6 loops=1)
-> Sort (cost=2015.46..2015.47 rows=1 width=247) (actual time=0.626..0.632 rows=6 loops=1)
Sort Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
Sort Method: quicksort Memory: 19kB
-> Nested Loop (cost=0.00..2015.45 rows=1 width=247) (actual time=0.283..0.603 rows=6 loops=1)
-> Index Scan using index_user_library_images_account on user_library_images (cost=0.00..445.57 rows=285 width=247) (actual time=0.076..0.273 rows=13 loops=1)
Index Cond: (account_id = 769718)
-> Index Scan using index_image_tags_user_library_image on image_tags (cost=0.00..5.50 rows=1 width=4) (actual time=0.020..0.021 rows=0 loops=13)
Index Cond: (user_library_image_id = user_library_images.id)
Filter: (tag ~~* '%stone%'::text)
Total runtime: 0.697 ms
(11 rows)
9.4:
Group (cost=166708.13..166709.46 rows=59 width=1241) (actual time=9677.052..9677.052 rows=0 loops=1)
Group Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
-> Sort (cost=166708.13..166708.28 rows=59 width=1241) (actual time=9677.049..9677.049 rows=0 loops=1)
Sort Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
Sort Method: quicksort Memory: 17kB
-> Hash Join (cost=10113.22..166706.39 rows=59 width=1241) (actual time=9677.035..9677.035 rows=0 loops=1)
Hash Cond: (image_tags.user_library_image_id = user_library_images.id)
-> Seq Scan on image_tags (cost=0.00..156488.85 rows=11855 width=4) (actual time=0.301..9592.048 rows=63868 loops=1)
Filter: (tag ~~* '%stone%'::text)
Rows Removed by Filter: 9370406
-> Hash (cost=10045.97..10045.97 rows=5380 width=1241) (actual time=0.047..0.047 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Bitmap Heap Scan on user_library_images (cost=288.12..10045.97 rows=5380 width=1241) (actual time=0.027..0.037 rows=4 loops=1)
Recheck Cond: (account_id = 769718)
Heap Blocks: exact=4
-> Bitmap Index Scan on index_user_library_images_account (cost=0.00..286.78 rows=5380 width=0) (actual time=0.019..0.019 rows=4 loops=1)
Index Cond: (account_id = 769718)
Planning time: 0.223 ms
Execution time: 9677.109 ms
(19 rows)
====
After running the analyze script (see the answer below), the problem was solved. For reference, here's the new EXPLAIN ANALYZE output (for 9.4):
Group (cost=2062.82..2062.91 rows=4 width=248) (actual time=8.775..8.801 rows=7 loops=1)
Group Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
-> Sort (cost=2062.82..2062.83 rows=4 width=248) (actual time=8.771..8.780 rows=7 loops=1)
Sort Key: user_library_images.id, user_library_images.imgsrc, user_library_images.library_image_id, user_library_images.type, user_library_images.is_user_uploaded, user_library_images.credit, user_library_images.orig_dimensions, user_library_images.account_id
Sort Method: quicksort Memory: 19kB
-> Nested Loop (cost=0.87..2062.78 rows=4 width=248) (actual time=4.156..8.685 rows=7 loops=1)
-> Index Scan using index_user_library_images_account on user_library_images (cost=0.43..469.62 rows=304 width=248) (actual time=0.319..2.528 rows=363 loops=1)
Index Cond: (account_id = 769718)
-> Index Scan using index_image_tags_user_library_image on image_tags (cost=0.43..5.23 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=363)
Index Cond: (user_library_image_id = user_library_images.id)
Filter: (tag ~~* '%stone%'::text)
Rows Removed by Filter: 2
Planning time: 2.956 ms
Execution time: 8.907 ms
(14 rows)
Limit (cost=65.81..65.81 rows=1 width=77) (actual time=0.256..0.256 rows=0 loops=1)
-> Sort (cost=65.81..65.92 rows=47 width=77) (actual time=0.252..0.252 rows=0 loops=1)
Sort Key: "timestamp"
Sort Method: quicksort Memory: 17kB
-> Index Scan using index_forum_posts_parent on forum_posts (cost=0.43..65.57 rows=47 width=77) (actual time=0.211..0.211 rows=0 loops=1)
Index Cond: (parent = 882269)
Planning time: 2.978 ms
Execution time: 0.380 ms
(8 rows)
pg_upgrade does not copy (or migrate) statistics for your database.
So you need to analyze your tables in order to update the statistics in the migrated database. pg_upgrade will create a batch file/shell script with the name analyze_new_cluster that can be used for that.
Alternatively you can use vacuum analyze manually to achieve the same thing.
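A minimal sketch of the manual route (plain ANALYZE is enough to rebuild the planner statistics; vacuumdb ships with PostgreSQL):
-- inside psql: statistics for every table in the current database
ANALYZE;
-- or from the shell, for all databases in the cluster:
-- vacuumdb --all --analyze-only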
The missing statistics can be detected by looking at the execution plan: the difference between the expected and actual row counts is far too large:
(cost=0.00..286.78 rows=5380 width=0) (actual time=0.019..0.019 rows=4 loops=1)
==> 5380 vs. 4 rows
or
(cost=0.00..156488.85 rows=11855 width=4) (actual time=0.301..9592.048 rows=63868 loops=1)
==> 11855 vs. 63868 rows

PostgreSQL query speed is variable

Context
I have a table that keeps netflow data (all packets intercepted by the router).
This table features approximately 5.9 million rows at the moment.
Problem
I am trying a simple query to count the number of packets received by day, which should not take long.
The first time I run it, the query takes 88 seconds, then after a second run, 33 seconds, then 5 seconds for all subsequent runs.
The main problem is not the speed of the query, but rather that after the same query has been executed three times it runs nearly 20 times faster.
I understand the concept of a query cache, but the performance of the original run makes no sense to me.
Tests
The column that I am using to join (datetime) is of type timestamptz, and is indexed:
CREATE INDEX date ON netflows USING btree (datetime);
Looking at the EXPLAIN output, the difference in execution is in the Nested Loop.
I have already run VACUUM ANALYZE on the table, with exactly the same results.
Current environment
Linux Ubuntu 12.04 VM running on VMware ESX 4.1
PostgreSQL 9.1
VM has 2 GB RAM, 2 cores.
database server is entirely dedicated to this and is doing nothing else
inserts in the table every minute (100 rows per minute)
very low disk, RAM, or CPU activity
Query
with date_list as (
select
series as start_date,
series + '23:59:59' as end_date
from
generate_series(
(select min(datetime) from netflows)::date,
(select max(datetime) from netflows)::date,
'1 day') as series
)
select
start_date,
end_date,
count(*)
from
netflows
inner join date_list on (datetime between start_date and end_date)
group by
start_date,
end_date;
Explain of first run (88 seconds)
Sort (cost=27007355.59..27007356.09 rows=200 width=8) (actual time=89647.054..89647.055 rows=18 loops=1)
Sort Key: date_list.start_date
Sort Method: quicksort Memory: 25kB
CTE date_list
-> Function Scan on generate_series series (cost=0.13..12.63 rows=1000 width=8) (actual time=92.567..92.667 rows=19 loops=1)
InitPlan 2 (returns $1)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=71.270..71.270 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=71.259..71.261 rows=1 loops=1)
-> Index Scan using date on netflows (cost=0.00..303662.15 rows=5945591 width=8) (actual time=71.252..71.252 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
InitPlan 4 (returns $3)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=11.786..11.787 rows=1 loops=1)
InitPlan 3 (returns $2)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=11.778..11.779 rows=1 loops=1)
-> Index Scan Backward using date on netflows (cost=0.00..303662.15 rows=5945591 width=8) (actual time=11.776..11.776 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
-> HashAggregate (cost=27007333.31..27007335.31 rows=200 width=8) (actual time=89639.167..89639.179 rows=18 loops=1)
-> Nested Loop (cost=0.00..23704227.20 rows=660621222 width=8) (actual time=92.667..88059.576 rows=5945457 loops=1)
-> CTE Scan on date_list (cost=0.00..20.00 rows=1000 width=16) (actual time=92.578..92.785 rows=19 loops=1)
-> Index Scan using date on netflows (cost=0.00..13794.89 rows=660621 width=8) (actual time=2.438..4571.884 rows=312919 loops=19)
Index Cond: ((datetime >= date_list.start_date) AND (datetime <= date_list.end_date))
Total runtime: 89668.047 ms
EXPLAIN of third run (5 seconds)
Sort (cost=27011357.45..27011357.95 rows=200 width=8) (actual time=5645.031..5645.032 rows=18 loops=1)
Sort Key: date_list.start_date
Sort Method: quicksort Memory: 25kB
CTE date_list
-> Function Scan on generate_series series (cost=0.13..12.63 rows=1000 width=8) (actual time=0.108..0.204 rows=19 loops=1)
InitPlan 2 (returns $1)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=0.050..0.050 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1)
-> Index Scan using date on netflows (cost=0.00..303705.14 rows=5946469 width=8) (actual time=0.046..0.046 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
InitPlan 4 (returns $3)
-> Result (cost=0.05..0.06 rows=1 width=0) (actual time=0.026..0.026 rows=1 loops=1)
InitPlan 3 (returns $2)
-> Limit (cost=0.00..0.05 rows=1 width=8) (actual time=0.026..0.026 rows=1 loops=1)
-> Index Scan Backward using date on netflows (cost=0.00..303705.14 rows=5946469 width=8) (actual time=0.026..0.026 rows=1 loops=1)
Index Cond: (datetime IS NOT NULL)
-> HashAggregate (cost=27011335.17..27011337.17 rows=200 width=8) (actual time=5645.005..5645.009 rows=18 loops=1)
-> Nested Loop (cost=0.00..23707741.28 rows=660718778 width=8) (actual time=0.134..4176.406 rows=5946329 loops=1)
-> CTE Scan on date_list (cost=0.00..20.00 rows=1000 width=16) (actual time=0.110..0.343 rows=19 loops=1)
-> Index Scan using date on netflows (cost=0.00..13796.94 rows=660719 width=8) (actual time=0.026..164.117 rows=312965 loops=19)
Index Cond: ((datetime >= date_list.start_date) AND (datetime <= date_list.end_date))
Total runtime: 5645.189 ms
If you are doing an INNER JOIN, I don't think you need the CTE at all. You can simply write:
select
datetime::date,
count(*)
from netflows
group by datetime::date /* or GROUP BY 1 as Postgres extension */
I don't see why you need the dates table unless you want a LEFT JOIN to get zeroes where appropriate. This will mean one pass through the data.
BTW, I discourage you from using keywords like date and datetime for entities and columns; even when it's legal, it's not worth it.
WITH date_list as (
SELECT t AS start_date
,(t + interval '1d') AS end_date
FROM (
SELECT generate_series((min(datetime))::date
,(max(datetime))::date
,'1d') AS t
FROM netflows
) x
)
SELECT d.start_date
,count(n.datetime) AS ct /* count(*) would yield 1, not 0, for days without rows */
FROM date_list d
LEFT JOIN netflows n ON n.datetime >= d.start_date
AND n.datetime < d.end_date
GROUP BY d.start_date;
And use a proper name for your index (already hinted at by @Andrew):
CREATE INDEX netflows_date_idx ON netflows (datetime);
Major points
Assuming you want a row for every day of the calendar, as @Andrew already mentioned in his answer, I replaced the JOIN with a LEFT JOIN.
It's much more efficient to grab min() and max() from netflows in one query.
Simplified type casting.
Fixed the date ranges. Your code would fail for timestamps like '2012-12-06 23:59:59.123'.
Tested this on a large table and performance was nice.
As to your original question: undoubtedly caching effects, which are to be expected - especially with limited RAM.
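You can confirm that with the BUFFERS option of EXPLAIN (available since 9.0, so fine on your 9.1). A sketch against the netflows table from the question:
EXPLAIN (ANALYZE, BUFFERS)
SELECT datetime::date, count(*)
FROM netflows
GROUP BY 1;
In the output, "Buffers: shared hit=X read=Y" means Y blocks came from outside PostgreSQL's shared buffers; a large Y on the first run that drops to near zero afterwards is exactly the caching effect you observed.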
