LOWER LIKE vs iLIKE - performance

How does the performance of the following two query components compare?
LOWER LIKE
... LOWER(description) LIKE '%abcde%' ...
iLIKE
... description iLIKE '%abcde%' ...

The answer depends on many factors like Postgres version, encoding and locale - LC_COLLATE in particular.
The bare expression lower(description) LIKE '%abc%' is typically a bit faster than description ILIKE '%abc%', and either is a bit faster than the equivalent regular expression: description ~* 'abc'. This matters for sequential scans where the expression has to be evaluated for every tested row.
But for big tables like the one you demonstrate in your answer, one would certainly use an index. For arbitrary patterns (not only left-anchored) I suggest a trigram index using the additional module pg_trgm. Then we talk about milliseconds instead of seconds, and the difference between the above expressions is nullified.
GIN and GiST indexes (using the gin_trgm_ops or gist_trgm_ops operator classes) support LIKE (~~), ILIKE (~~*), ~, ~* (and some more variants) alike. With a trigram GIN index on description (typically bigger than GiST, but faster for reads), your query would use description ILIKE 'case_insensitive_pattern'.
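For example, a minimal sketch of such a trigram index (assuming the books table from the test below; the index name is illustrative):
-- pg_trgm ships with Postgres as an additional module
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- trigram GIN index: supports LIKE, ILIKE, ~ and ~* on description
CREATE INDEX books_description_trgm_idx ON books USING gin (description gin_trgm_ops);
-- the index can then serve case-insensitive, non-anchored patterns:
SELECT * FROM books WHERE description ILIKE '%abcde%';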
Related:
PostgreSQL LIKE query performance variations
Similar UTF-8 strings for autocomplete field
Basics for pattern matching in Postgres:
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
When working with said trigram index it's typically more practical to work with:
description ILIKE '%abc%'
Or with the case-insensitive regexp operator (without % wildcards):
description ~* 'abc'
An index on (description) does not support queries on lower(description) like:
lower(description) LIKE '%abc%'
And vice versa.
With predicates on lower(description) exclusively, the expression index is the slightly better option.
In all other cases, an index on (description) is preferable as it supports both case-sensitive and -insensitive predicates.
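A sketch of the two alternatives (illustrative names; the expression index only helps predicates spelled with lower(...), while the plain index helps both spellings):
-- expression index: only matches predicates written against lower(description)
CREATE INDEX books_description_lower_trgm_idx ON books USING gin (lower(description) gin_trgm_ops);
-- plain index: serves description LIKE / ILIKE / ~ / ~* directly
CREATE INDEX books_description_trgm_idx ON books USING gin (description gin_trgm_ops);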

According to my tests (ten of each query), LOWER LIKE is about 17% faster than iLIKE.
Explanation
I created a million rows containing some random mixed text data:
require 'securerandom'
inserts = []
1000000.times do |i|
inserts << "(1, 'fake', '#{SecureRandom.urlsafe_base64(64)}')"
end
sql = "insert into books (user_id, title, description) values #{inserts.join(', ')}"
ActiveRecord::Base.connection.execute(sql)
Verify the number of rows:
my_test_db=# select count(id) from books ;
count
---------
1000009
(Yes, I have nine extra rows from other tests - not a problem.)
Example query and results:
my_test_db=# SELECT "books".* FROM "books" WHERE "books"."published" = 'f'
my_test_db=# and (LOWER(description) LIKE '%abcde%') ;
id | user_id | title | description | published
---------+---------+-------+----------------------------------------------------------------------------------------+------
1232322 | 1 | fake | 5WRGr7oCKABcdehqPKsUqV8ji61rsNGS1TX6pW5LJKrspOI_ttLNbaSyRz1BwTGQxp3OaxW7Xl6fzVpCu9y3fA | f
1487103 | 1 | fake | J6q0VkZ8-UlxIMZ_MFU_wsz_8MP3ZBQvkUo8-2INiDIp7yCZYoXqRyp1Lg7JyOwfsIVdpPIKNt1uLeaBCdelPQ | f
1817819 | 1 | fake | YubxlSkJOvmQo1hkk5pA1q2mMK6T7cOdcU3ADUKZO8s3otEAbCdEcmm72IOxiBdaXSrw20Nq2Lb383lq230wYg | f
Results for LOWER LIKE
my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (LOWER(description) LIKE '%abcde%') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..32420.14 rows=1600 width=117) (actual time=938.627..4114.038 rows=3 loops=1)
Filter: ((NOT published) AND (lower(description) ~~ '%abcde%'::text))
Rows Removed by Filter: 1000006
Total runtime: 4114.098 ms
Results for iLIKE
my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (description iLIKE '%abcde%') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..29920.11 rows=100 width=117) (actual time=1147.612..4986.771 rows=3 loops=1)
Filter: ((NOT published) AND (description ~~* '%abcde%'::text))
Rows Removed by Filter: 1000006
Total runtime: 4986.831 ms
Database info disclosure
Postgres version:
my_test_db=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.4 on x86_64-apple-darwin12.4.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit
Collation setting:
my_test_db=# select datcollate from pg_database where datname = 'my_test_db';
datcollate
-------------
en_CA.UTF-8
Table definition:
my_test_db=# \d books
Table "public.books"
Column | Type | Modifiers
-------------+-----------------------------+-------------------------------------------------------
id | integer | not null default nextval('books_id_seq'::regclass)
user_id | integer | not null
title | character varying(255) | not null
description | text | not null default ''::text
published | boolean | not null default false
Indexes:
"books_pkey" PRIMARY KEY, btree (id)

In my Rails project, ILIKE is almost 10x faster than LOWER LIKE. I added a GIN index on the entities.name column:
> Entity.where("LOWER(name) LIKE ?", name.strip.downcase).limit(1).first
Entity Load (2443.9ms) SELECT "entities".* FROM "entities" WHERE (lower(name) like 'baidu') ORDER BY "entities"."id" ASC LIMIT $1 [["LIMIT", 1]]
> Entity.where("name ILIKE ?", name.strip).limit(1).first
Entity Load (285.0ms) SELECT "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT $1 [["LIMIT", 1]]
# explain analyze SELECT "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3186.03..3186.04 rows=1 width=1588) (actual time=7.812..7.812 rows=1 loops=1)
-> Sort (cost=3186.03..3187.07 rows=414 width=1588) (actual time=7.811..7.811 rows=1 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 26kB
-> Bitmap Heap Scan on entities (cost=1543.21..3183.96 rows=414 width=1588) (actual time=7.797..7.805 rows=1 loops=1)
Recheck Cond: ((name)::text ~~* 'Baidu'::text)
Rows Removed by Index Recheck: 6
Heap Blocks: exact=7
-> Bitmap Index Scan on index_entities_on_name (cost=0.00..1543.11 rows=414 width=0) (actual time=7.787..7.787 rows=7 loops=1)
Index Cond: ((name)::text ~~* 'Baidu'::text)
Planning Time: 6.375 ms
Execution Time: 7.874 ms
(12 rows)
A GIN index is really helpful for improving ILIKE performance.
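The index definition isn't shown here; presumably (an assumption based on the Bitmap Index Scan in the plan above) it is a trigram GIN index along these lines:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- assumed definition matching the index name shown in the plan
CREATE INDEX index_entities_on_name ON entities USING gin (name gin_trgm_ops);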


sysdate() causes Postgres to ignore the Index and do a costly Sequential Scan

Has anyone ever encountered this? Postgres Enterprise DB Advanced Server 11.5.12.
sysdate() (Oracle proprietary) results in a Seq Scan of, in this case, 4,782 rows:
EXPLAIN SELECT p.id, p.practice
FROM PatientStatistics ps
INNER JOIN Patients p
ON p.id=ps.patient
WHERE ps.nextfutureapptdateservertime <= sysdate()
ORDER BY p.id ASC;
Hash Join (cost=799.81..1761.53 rows=4782 width=8)
Hash Cond: (p.id = ps.patient)
-> Index Only Scan using patients_index3 on patients p (cost=0.29..921.44 rows=15442 width=8)
-> Hash (cost=644.11..644.11 rows=4782 width=4)
-> Seq Scan on patientstatistics ps (cost=0.00..644.11 rows=4782 width=4)
Filter: (nextfutureapptdateservertime <= sysdate)
Changing to now() or current_timestamp (SQL Standard) fixes the issue. Postgres is correctly using the Index:
EXPLAIN SELECT p.id, p.practice
FROM PatientStatistics ps
INNER JOIN Patients p
ON p.id=ps.patient
WHERE ps.nextfutureapptdateservertime <= now()
ORDER BY p.id ASC;
Nested Loop (cost=0.57..51.41 rows=17 width=8)
-> Index Only Scan using "patientstatisti_idx$$_0c9a0048" on patientstatistics ps (cost=0.29..8.53 rows=17 width=4)
Index Cond: (nextfutureapptdateservertime <= now())
-> Index Scan using patients_pk on patients p (cost=0.29..2.52 rows=1 width=8)
Index Cond: (id = ps.patient)
It is interesting to note the difference in output of those functions:
SELECT now();
SELECT current_timestamp;
15-JAN-20 09:36:41.932741 -05:00
15-JAN-20 09:36:41.932930 -05:00
SELECT sysdate();
15-JAN-20 09:37:17
Perhaps Postgres's date indexes are keyed on timestamps that include a fractional-seconds portion. The planner sees it was passed a value without that fractional part, knows the index keys won't line up exactly, and backs off to a scan to ensure the query delivers 100% accurate results.
I could find nothing about this online after a 30-minute Googling.
I don't know EDB's proprietary fork, so the following is based on guesswork.
now() or (equivalently) current_timestamp is a STABLE function, so it returns the same value if it is evaluated more than once in the course of a statement execution (and indeed of a transaction).
The suspicion is that sysdate, like PostgreSQL's clock_timestamp(), is VOLATILE (returns the actual time).
Then the function can have a different value every time it is compared to a row, which makes it impossible to use an index scan.
If my suspicion is not correct, I'd call it an EDB bug.
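In stock PostgreSQL the difference is easy to demonstrate with the VOLATILE clock_timestamp() against the STABLE now() (a quick sketch):
-- now() is frozen at statement start; clock_timestamp() keeps advancing from row to row
SELECT now() AS stable_ts, clock_timestamp() AS volatile_ts
FROM generate_series(1, 3);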
I don't know how they implemented it, but this workaround functions correctly here:
CREATE OR REPLACE FUNCTION mysysdate(OUT timestamptz)
AS
$func$
select now();
$func$
language sql stable;
select mysysdate() ;
EXPLAIN select *
FROM public.feature_timeslice
WHERE valid_time_begin < mysysdate() - '10 year + 14 days'::interval;
select version() ;
\df+ mysysdate
Output:
CREATE FUNCTION
mysysdate
-------------------------------
2020-01-15 17:15:13.896497+01
(1 row)
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Index Scan using feature_timeslice_alt2 on feature_timeslice (cost=0.42..4474.84 rows=9206 width=28)
Index Cond: (valid_time_begin < (now() - '10 years 14 days'::interval))
(2 rows)
version
-------------------------------------------------------------------------------------------------------
PostgreSQL 11.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
(1 row)
List of functions
Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description
--------+-----------+--------------------------+------------------------------+------+------------+----------+----------+----------+-------------------+----------+---------------+-------------
tmp | mysysdate | timestamp with time zone | OUT timestamp with time zone | func | stable | unsafe | postgres | invoker | | sql | +|
| | | | | | | | | | | select now();+|
| | | | | | | | | | | |
(1 row)
Note: the granularity does not affect the query plan,
select date_trunc('sec', now());
also results in an index scan.
Yup, it must be the volatility thing. PG's docs on the matter: https://www.postgresql.org/docs/8.2/xfunc-volatility.html
They show "timeofday()" as an example of a VOLATILE function.
now() - STABLE - the time when this query began. Call it 6 times in the same query and it returns the same time.
timeofday() and sysdate() - VOLATILE - the time at the moment the function itself was called, not when the query began. It's like shelling out to the operating system's date tool. Call it 6 times in the same query and you'll get 6 different times.
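A function's classification is recorded in pg_proc; a quick sketch to check it (on stock PostgreSQL the sysdate row simply won't exist):
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
SELECT proname, provolatile
FROM pg_proc
WHERE proname IN ('now', 'clock_timestamp', 'timeofday', 'sysdate');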

PostgreSQL pagination along with row number indexing

Okay, so I am working on porting queries from an Oracle DB to Postgres. My query needs to give me numbered records, along with pagination.
Consider the following oracle code:
select * from (
select RS.*, ROWNUM as RN from (
select * from STUDENTS order by GRADES
) RS where ROWNUM <= (#{startIndex} + #{pageSize})
) where RN > #{startIndex}
Notice that there are 2 uses of ROWNUM here:
To provide a row number to each row in query result.
For pagination.
I need to port such a query to postgres.
I know how to paginate using LIMIT and OFFSET, but I am not able to provide a global row number (each row in the query result gets a unique row number).
On the other hand, I was able to find the ROW_NUMBER() function, which can provide me with global row numbers, but it is not recommended for pagination purposes, since the number of tuples in my DB is very large.
How do I write a similar code in postgres?
The solution looks much simpler in PostgreSQL:
SELECT *,
row_number() OVER (ORDER BY grades, id) AS rn
FROM students
ORDER BY grades, id
OFFSET $1 LIMIT $2;
Here, id stands for the primary key and is used to disambiguate between equal grades.
That query is efficient if there is an index on grades and the offset is not too high:
EXPLAIN (ANALYZE)
SELECT *,
row_number() OVER (ORDER BY grades, id) AS rn
FROM students
ORDER BY grades, id
OFFSET 10 LIMIT 20;
QUERY PLAN
-------------------------------------------------------------------
Limit (cost=1.01..2.49 rows=20 width=20)
(actual time=0.204..0.365 rows=20 loops=1)
-> WindowAgg (cost=0.28..74.25 rows=1000 width=20)
(actual time=0.109..0.334 rows=30 loops=1)
-> Index Scan using students_grades_idx on students
(cost=0.28..59.25 rows=1000 width=12)
(actual time=0.085..0.204 rows=30 loops=1)
Planning time: 0.515 ms
Execution time: 0.627 ms
(5 rows)
Observe the actual values in the plan.
Pagination with OFFSET is always inefficient with large offsets; consider keyset pagination.
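A keyset-pagination sketch under the same assumptions (an index on (grades, id); $1/$2 hold the grades and id of the last row of the previous page, and the client carries the running row number forward itself):
SELECT *
FROM students
WHERE (grades, id) > ($1, $2)   -- row-value comparison, can use the (grades, id) index
ORDER BY grades, id
LIMIT 20;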

Why are indexed ORDER BY queries matching many rows a LOT faster than queries matching only a few?

Okay so I have the following query:
explain analyze SELECT seller_region FROM "products"
WHERE "products"."seller_region" = 'Bremen'
AND "products"."state" = 'active'
ORDER BY products.rank DESC,
products.score ASC NULLS LAST,
GREATEST(products.created_at, products.price_last_updated_at) DESC
LIMIT 14 OFFSET 0
The query filter matches around 11,000 rows. If we look at the query plan, we can see that the query uses the index index_products_active_for_default_order and is very fast:
Limit (cost=0.43..9767.16 rows=14 width=36) (actual time=1.576..6.711 rows=14 loops=1)
-> Index Scan using index_products_active_for_default_order on products (cost=0.43..4951034.14 rows=7097 width=36) (actual time=1.576..6.709 rows=14 loops=1)
Filter: ((seller_region)::text = 'Bremen'::text)
Rows Removed by Filter: 3525
Total runtime: 6.724 ms
Now if I replace 'Bremen' with 'Sachsen' like so in the query:
explain analyze SELECT seller_region FROM "products"
WHERE "products"."seller_region" = 'Sachsen'
AND "products"."state" = 'active'
ORDER BY products.rank DESC,
products.score ASC NULLS LAST,
GREATEST(products.created_at, products.price_last_updated_at) DESC
LIMIT 14 OFFSET 0
The same query only matches around 70 rows and is now consistently very very slow, even though it uses the same index in the exact same way:
Limit (cost=0.43..1755.00 rows=14 width=36) (actual time=2.498..1831.737 rows=14 loops=1)
-> Index Scan using index_products_active_for_default_order on products (cost=0.43..4951034.14 rows=39505 width=36) (actual time=2.496..1831.727 rows=14 loops=1)
Filter: ((seller_region)::text = 'Sachsen'::text)
Rows Removed by Filter: 963360
Total runtime: 1831.760 ms
I don't understand why this happens. Intuitively I would expect the query matching more rows to be slower, but it's the other way around. I have tested this with other queries on other columns of my tables as well, and the phenomenon is the same: with two similar queries using the same ordering as the ones above, the ones matching many rows run hundreds of times faster than those where the filter only matches a few. Why is this, and how can I avoid this behavior?
PS: I'm using postgres 9.3 and the index is defined as follows:
CREATE INDEX index_products_active_for_default_order
ON products
USING btree
(rank DESC, score COLLATE pg_catalog."default", (GREATEST(created_at, price_last_updated_at)) DESC)
WHERE state::text = 'active'::text;
That is because the first 14 matching rows for Bremen are found in the first 3539 index rows, while for Sachsen 963374 rows have to be scanned.
I recommend an index on (seller_region, rank).
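A hedged sketch of such an index, extended here (my assumption, not part of the recommendation above) to mirror the full ORDER BY and the partial predicate so the top 14 rows can be read straight off the index:
CREATE INDEX index_products_active_by_region_for_default_order
ON products (seller_region,
             rank DESC,
             score ASC NULLS LAST,
             (GREATEST(created_at, price_last_updated_at)) DESC)
WHERE state = 'active';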

Slow query time in postgresql when UTC milliseconds is stored as bigint

We are migrating from a time series database (ECHO historian) to an open source database, basically due to the price factor. Our choice was PostgreSQL, as there is no open source time series database. What we used to store in ECHO was just time and value pairs.
Now here is the problem. The table that I created in Postgres consists of 2 columns. The first is of "bigint" type, storing the time in UTC milliseconds (a 13-digit number), and the second is the value, whose data type is "real". I filled in around 3.6 million rows of data (spread across a time range of 30 days), and when I query a small time range (say 1 day) the query takes 4 seconds, but for the same time range in ECHO the response time is 150 millisecs!
This is a huge difference. Having a bigint for time seems to be the reason for the slowness, but I am not sure. Could you please suggest how the query time can be improved?
I also read about using the data types "timestamp" and "timestamptz", and it looks like we would need to store the date and time in a regular format rather than UTC milliseconds. Can this help to speed up my query time?
Here is my table definition :
Table "public. MFC2 Flow_LCL "
Column | Type | Modifiers | Storage | Stats target | Description
----------+--------+-----------+---------+--------------+-------------
the_time | bigint | | plain | |
value | real | | plain | |
Indexes:
"MFC2 Flow_LCL _time_idx" btree (the_time)
Has OIDs: no
Currently I am storing the time in UTC milliseconds (using bigint). The challenge here is that there could be duplicate time-value pairs.
This is the query I am using (called through a simple API which will pass the table name, start time and end time):
PGresult *res;
int rec_count;
std::string sSQL;
sSQL.append("SELECT * FROM ");
sSQL.append(" \" ");
sSQL.append(table);
sSQL.append(" \" ");
sSQL.append(" WHERE");
sSQL.append(" time >= ");
CString sTime;
sTime.Format("%I64d",startTime);
sSQL.append(sTime);
sSQL.append(" AND time <= ");
CString eTime;
eTime.Format("%I64d",endTime);
sSQL.append(eTime);
sSQL.append(" ORDER BY time ");
res = PQexec(conn, sSQL.c_str());
Your time series database, if it works like a competitor I examined once, automatically stores data in the order of the "time" column in a heap-like structure. Postgres does not. As a result, you are doing an O(n) search [n = number of rows in the table]: the entire table must be read to look for rows matching your time filter. A primary key on the timestamp (which creates a unique index) or, if timestamps are not unique, a regular index will give you O(log n) binary searches for single records and improved performance for all queries retrieving less than about 5% of the table. Postgres estimates the crossover point at which an index scan or a full table scan is cheaper.
You probably also want to CLUSTER (PG Docs) the table on that index.
Also, follow the advice above not to use time or other SQL reserved words as column names. Even when it is legal, it's asking for trouble.
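A minimal sketch of the CLUSTER suggestion above, using hypothetical identifiers (mfc2_flow_lcl / mfc2_flow_lcl_time_idx) rather than the exact quoted names from the question:
-- the \d output above shows a btree index on the_time already exists, so mainly the CLUSTER step is missing
CLUSTER mfc2_flow_lcl USING mfc2_flow_lcl_time_idx;  -- rewrite the heap in index order (one-off; repeat after large loads)
ANALYZE mfc2_flow_lcl;                               -- refresh statistics afterwards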
[This would be better as a comment, but it is too long for that.]
Are you really planning for the year 2038 problem already? Why not just use an int for time as in standard UNIX?
SET search_path=tmp;
-- -------------------------------------------
-- create table and populate it with 10M rows
-- -------------------------------------------
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE old_echo
( the_time timestamp NOT NULL PRIMARY KEY
, payload DOUBLE PRECISION NOT NULL
);
INSERT INTO old_echo (the_time, payload)
SELECT now() - (gs * interval '1 msec')
, random()
FROM generate_series(1,10000000) gs
;
-- DELETE FROM old_echo WHERE random() < 0.8;
VACUUM ANALYZE old_echo;
SELECT MIN(the_time) AS first
, MAX(the_time) AS last
, (MAX(the_time) - MIN(the_time))::interval AS width
FROM old_echo
;
EXPLAIN ANALYZE
SELECT *
FROM old_echo oe
JOIN (
SELECT MIN(the_time) AS first
, MAX(the_time) AS last
, (MAX(the_time) - MIN(the_time))::interval AS width
, ((MAX(the_time) - MIN(the_time))/2)::interval AS half
FROM old_echo
) mima ON 1=1
WHERE oe.the_time >= mima.first + mima.half
AND oe.the_time < mima.first + mima.half + '1 sec':: interval
;
RESULT:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.06..59433.67 rows=1111124 width=64) (actual time=0.101..1.307 rows=1000 loops=1)
-> Result (cost=0.06..0.07 rows=1 width=0) (actual time=0.049..0.050 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.03 rows=1 width=8) (actual time=0.022..0.022 rows=1 loops=1)
-> Index Scan using old_echo_pkey on old_echo (cost=0.00..284873.62 rows=10000115 width=8) (actual time=0.021..0.021 rows=1 loops=1)
Index Cond: (the_time IS NOT NULL)
InitPlan 2 (returns $1)
-> Limit (cost=0.00..0.03 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
-> Index Scan Backward using old_echo_pkey on old_echo (cost=0.00..284873.62 rows=10000115 width=8) (actual time=0.009..0.009 rows=1 loops=1)
Index Cond: (the_time IS NOT NULL)
-> Index Scan using old_echo_pkey on old_echo oe (cost=0.01..34433.30 rows=1111124 width=16) (actual time=0.042..0.764 rows=1000 loops=1)
Index Cond: ((the_time >= (($0) + ((($1 - $0) / 2::double precision)))) AND (the_time < ((($0) + ((($1 - $0) / 2::double precision))) + '00:00:01'::interval)))
Total runtime: 1.504 ms
(13 rows)
UPDATE: since the timestamp appears to be non-unique (btw: what do duplicates even mean in that case?) I added an extra key column. An ugly hack, but it works here. Query time is 11 ms for 10M rows minus 80% (number of rows hit: 210/222067):
CREATE TABLE old_echo
( the_time timestamp NOT NULL
, the_seq SERIAL NOT NULL -- to catch the duplicate keys
, payload DOUBLE PRECISION NOT NULL
, PRIMARY KEY(the_time, the_seq)
);
-- Adding the random will cause some timestamps to be non-unique.
-- (and others to be non-existent)
INSERT INTO old_echo (the_time, payload)
SELECT now() - ((gs+random()*1000::integer) * interval '1 msec')
, random()
FROM generate_series(1,10000000) gs
;
DELETE FROM old_echo WHERE random() < 0.8;

PostgreSQL Simple JOIN very slow

I have a simple query, and two tables:
drilldown
CREATE SEQUENCE drilldown_id_seq;
CREATE TABLE drilldown (
transactionid bigint NOT NULL DEFAULT nextval('drilldown_id_seq'),
userid bigint NOT NULL default 0 REFERENCES users(id),
pathid bigint NOT NULL default 0,
reqms bigint NOT NULL default 0,
quems bigint NOT NULL default 0,
clicktime timestamp default current_timestamp,
PRIMARY KEY(transactionid)
);
ALTER SEQUENCE drilldown_id_seq OWNED BY drilldown.transactionid;
CREATE INDEX drilldown_idx1 ON drilldown (clicktime);
querystats
CREATE SEQUENCE querystats_id_seq;
CREATE TABLE querystats (
id bigint NOT NULL DEFAULT nextval('querystats_id_seq'),
transactionid bigint NOT NULL default 0 REFERENCES drilldown(transactionid),
querynameid bigint NOT NULL default 0 REFERENCES queryname(id),
queryms bigint NOT NULL default 0,
PRIMARY KEY(id)
);
ALTER SEQUENCE querystats_id_seq OWNED BY querystats.id;
CREATE INDEX querystats_idx1 ON querystats (transactionid);
CREATE INDEX querystats_idx2 ON querystats (querynameid);
drilldown has 1.5 million records, and querystats has 10 million records; the problem happens when I do a join between the two.
QUERY
explain analyse
select avg(qs.queryms)
from querystats qs
join drilldown d on (qs.transactionid=d.transactionid)
where querynameid=1;
QUERY PLAN
Aggregate (cost=528596.96..528596.97 rows=1 width=8) (actual time=5213.154..5213.154 rows=1 loops=1)
-> Hash Join (cost=274072.53..518367.59 rows=4091746 width=8) (actual time=844.087..3528.788 rows=4117717 loops=1)
Hash Cond: (qs.transactionid = d.transactionid)
-> Bitmap Heap Scan on querystats qs (cost=88732.62..210990.44 rows=4091746 width=16) (actual time=309.502..1321.029 rows=4117717 loops=1)
Recheck Cond: (querynameid = 1)
-> Bitmap Index Scan on querystats_idx2 (cost=0.00..87709.68 rows=4091746 width=0) (actual time=307.916..307.916 rows=4117718 loops=1)
Index Cond: (querynameid = 1)
-> Hash (cost=162842.29..162842.29 rows=1371250 width=8) (actual time=534.065..534.065 rows=1372574 loops=1)
Buckets: 4096 Batches: 64 Memory Usage: 850kB
-> Index Scan using drilldown_pkey on drilldown d (cost=0.00..162842.29 rows=1371250 width=8) (actual time=0.015..364.657 rows=1372574 loops=1)
Total runtime: 5213.205 ms
(11 rows)
I know there are some tuning parameters I can adjust for PostgreSQL, but what I want to know is: is the query I am writing the most optimal way of joining the two tables?
Or maybe some sort of INNER JOIN? I'm just not sure.
Any pointers are appreciated!
EDIT
database#\d drilldown
Table "public.drilldown"
Column | Type | Modifiers
---------------+-----------------------------+--------------------------------------------------------
transactionid | bigint | not null default nextval('drilldown_id_seq'::regclass)
userid | bigint | not null default 0
pathid | bigint | not null default 0
reqms | bigint | not null default 0
quems | bigint | not null default 0
clicktime | timestamp without time zone | default now()
Indexes:
"drilldown_pkey" PRIMARY KEY, btree (transactionid)
"drilldown_idx1" btree (clicktime)
Foreign-key constraints:
"drilldown_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id)
Referenced by:
TABLE "querystats" CONSTRAINT "querystats_transactionid_fkey" FOREIGN KEY (transactionid) REFERENCES drilldown(transactionid)
database=# \d querystats
Table "public.querystats"
Column | Type | Modifiers
---------------+--------+---------------------------------------------------------
id | bigint | not null default nextval('querystats_id_seq'::regclass)
transactionid | bigint | not null default 0
querynameid | bigint | not null default 0
queryms | bigint | not null default 0
Indexes:
"querystats_pkey" PRIMARY KEY, btree (id)
"querystats_idx1" btree (transactionid)
"querystats_idx2" btree (querynameid)
Foreign-key constraints:
"querystats_querynameid_fkey" FOREIGN KEY (querynameid) REFERENCES queryname(id)
"querystats_transactionid_fkey" FOREIGN KEY (transactionid) REFERENCES drilldown(transactionid)
So here are the two tables requested and version
PostgreSQL 9.1.7 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
So what this query is doing is getting the average of the queryms values over all rows for a given query type (querynameid).
name | current_setting | source
----------------------------+----------------------------------+----------------------
application_name | psql | client
client_encoding | UTF8 | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
enable_seqscan | off | session
external_pid_file | /var/run/postgresql/9.1-main.pid | configuration file
lc_messages | en_US.UTF-8 | configuration file
lc_monetary | en_US.UTF-8 | configuration file
lc_numeric | en_US.UTF-8 | configuration file
lc_time | en_US.UTF-8 | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | environment variable
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 24MB | configuration file
ssl | on | configuration file
TimeZone | localtime | environment variable
unix_socket_directory | /var/run/postgresql | configuration file
(19 rows)
I see that enable_seqscan=off, I have not touched any settings, this is a completely default install.
UPDATE
I made some changes from the below comments and here is the results.
explain analyse SELECT (SELECT avg(queryms) AS total FROM querystats WHERE querynameid=3) as total FROM querystats qs JOIN drilldown d ON (qs.transactionid=d.transactionid) WHERE qs.querynameid=3 limit 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=196775.99..196776.37 rows=1 width=0) (actual time=2320.876..2320.876 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Aggregate (cost=196775.94..196775.99 rows=1 width=8) (actual time=2320.815..2320.815 rows=1 loops=1)
-> Bitmap Heap Scan on querystats (cost=24354.25..189291.69 rows=2993698 width=8) (actual time=226.516..1144.690 rows=2999798 loops=1)
Recheck Cond: (querynameid = 3)
-> Bitmap Index Scan on querystats_idx (cost=0.00..23605.83 rows=2993698 width=0) (actual time=225.119..225.119 rows=2999798 loops=1)
Index Cond: (querynameid = 3)
-> Nested Loop (cost=0.00..1127817.12 rows=2993698 width=0) (actual time=2320.876..2320.876 rows=1 loops=1)
-> Seq Scan on drilldown d (cost=0.00..76745.10 rows=1498798 width=8) (actual time=0.009..0.009 rows=1 loops=1)
-> Index Scan using querystats_idx on querystats qs (cost=0.00..0.60 rows=2 width=8) (actual time=0.045..0.045 rows=1 loops=1)
Index Cond: ((querynameid = 3) AND (transactionid = d.transactionid))
Total runtime: 2320.940 ms
(12 rows)
It's behaving as though you have set enable_seqscan = off, because it is using an index scan to populate a hash table. Never set any of the planner options off except as a diagnostic step, and if you are showing a plan, please show any options used. This can be run to show a lot of the useful information:
SELECT version();
SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');
It also helps if you tell us about the runtime environment, especially the amount of RAM on the machine, what your storage system looks like, and the size of the database (or even better, the active data set of frequently referenced data in the database).
As a rough breakdown, the 5.2 seconds breaks down to:
1.3 seconds to find the 4,117,717 querystats rows that match your selection criterion.
2.3 seconds to randomly match those against drilldown records.
1.6 seconds to pass the 4,117,717 rows and calculate an average.
So, even though you seem to have crippled its ability to use the fastest plan, it is taking only 1.26 microseconds (millionths of a second) to locate each row, join it to another, and work it into a calculation of an average. That's not too bad on an absolute basis, but you can almost certainly get a slightly faster plan.
First off, if you are using 9.2.x where x is less than 3, upgrade to 9.2.3 immediately. There was a performance regression for some types of plans which was fixed in the recent release which might affect this query. In general, try to stay up-to-date on minor releases (where version number changes past the second dot).
You can test different plans in a single session by setting planning factors on just that connection and running your query (or an EXPLAIN on it). Try something like this:
SET seq_page_cost = 0.1;
SET random_page_cost = 0.1;
SET cpu_tuple_cost = 0.05;
SET effective_cache_size = '3GB'; -- actually use shared_buffers plus OS cache
Make sure that all enable_ settings are on.
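A quick way to verify that (a sketch):
-- lists any planner enable_* switch that is currently off
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'enable%' AND setting = 'off';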
You claim in your question:
I see that enable_seqscan=off, I have not touched any settings, this is a completely default install.
In contrast, the output from pg_settings tells us:
enable_seqscan | off | session
Meaning, that you set enable_seqscan = off in your session. Something is not adding up here.
Run
SET enable_seqscan = on;
or
RESET enable_seqscan;
Assert:
SHOW enable_seqscan;
Also, your setting for shared_buffers is way too low for a db with millions of records. 24MB seems to be the conservative setting of Ubuntu out-of-the-box. You need to edit your configuration files for serious use! I quote the manual:
If you have a dedicated database server with 1GB or more of RAM, a
reasonable starting value for shared_buffers is 25% of the memory in your system.
So edit your postgresql.conf file to increase the value and restart the server (shared_buffers cannot be changed with a mere reload).
Then try your query again and find out how enable_seqscan was turned off.
In this query
select avg(qs.queryms)
from querystats qs
join drilldown d
on (qs.transactionid=d.transactionid)
where querynameid=1;
you're not using any of the columns from the table "drilldown". Since the foreign key constraint guarantees there's a row in "drilldown" for every "transactionid" in "querystats", I don't think the join will do anything useful. Unless I've missed something, your query is equivalent to
select avg(qs.queryms)
from querystats qs
where querynameid=1;
No join at all. As long as there's an index on "querynameid" you should get decent performance.
When you don't join, avg(qs.queryms) executes once.
When you do the join, you are executing avg(qs.queryms) as many times as there are rows generated by the join.
If you're always interested in a single querynameid, try putting avg(qs.queryms) in a subselect:
SELECT
(SELECT avg(queryms) FROM querystats WHERE querynameid=1)
FROM querystats qs
JOIN drilldown d ON (qs.transactionid=d.transactionid)
WHERE qs.querynameid=1;
The querystats table looks like a fat junction table to me. In that case: omit the surrogate key, live on the natural (composite) key (both components are already NOT NULLable), and add a reversed composite index. (The separate single-column indexes then become redundant: the primary key covers lookups by transactionid, and the reversed composite index covers lookups by querynameid.)
-- CREATE SEQUENCE querystats_id_seq;
CREATE TABLE querystats (
-- id bigint NOT NULL DEFAULT nextval('querystats_id_seq'),
transactionid bigint NOT NULL default 0 REFERENCES drilldown(transactionid),
querynameid bigint NOT NULL default 0 REFERENCES queryname(id),
queryms bigint NOT NULL default 0,
PRIMARY KEY(transactionid,querynameid )
);
-- ALTER SEQUENCE querystats_id_seq OWNED BY querystats.id;
--CREATE INDEX querystats_idx1 ON querystats (transactionid);
-- CREATE INDEX querystats_idx2 ON querystats (querynameid);
CREATE UNIQUE INDEX querystats_alt ON querystats (querynameid, transactionid);
