PostgreSQL Simple JOIN very slow - performance

I have a simple query, and two tables:
drilldown
CREATE SEQUENCE drilldown_id_seq;
CREATE TABLE drilldown (
transactionid bigint NOT NULL DEFAULT nextval('drilldown_id_seq'),
userid bigint NOT NULL default 0 REFERENCES users(id),
pathid bigint NOT NULL default 0,
reqms bigint NOT NULL default 0,
quems bigint NOT NULL default 0,
clicktime timestamp default current_timestamp,
PRIMARY KEY(transactionid)
);
ALTER SEQUENCE drilldown_id_seq OWNED BY drilldown.transactionid;
CREATE INDEX drilldown_idx1 ON drilldown (clicktime);
querystats
CREATE SEQUENCE querystats_id_seq;
CREATE TABLE querystats (
id bigint NOT NULL DEFAULT nextval('querystats_id_seq'),
transactionid bigint NOT NULL default 0 REFERENCES drilldown(transactionid),
querynameid bigint NOT NULL default 0 REFERENCES queryname(id),
queryms bigint NOT NULL default 0,
PRIMARY KEY(id)
);
ALTER SEQUENCE querystats_id_seq OWNED BY querystats.id;
CREATE INDEX querystats_idx1 ON querystats (transactionid);
CREATE INDEX querystats_idx2 ON querystats (querynameid);
drilldown has 1.5 million records and querystats has 10 million records; the problem happens when I do a join between the two.
QUERY
explain analyse
select avg(qs.queryms)
from querystats qs
join drilldown d on (qs.transactionid=d.transactionid)
where querynameid=1;
QUERY PLAN
Aggregate (cost=528596.96..528596.97 rows=1 width=8) (actual time=5213.154..5213.154 rows=1 loops=1)
-> Hash Join (cost=274072.53..518367.59 rows=4091746 width=8) (actual time=844.087..3528.788 rows=4117717 loops=1)
Hash Cond: (qs.transactionid = d.transactionid)
-> Bitmap Heap Scan on querystats qs (cost=88732.62..210990.44 rows=4091746 width=16) (actual time=309.502..1321.029 rows=4117717 loops=1)
Recheck Cond: (querynameid = 1)
-> Bitmap Index Scan on querystats_idx2 (cost=0.00..87709.68 rows=4091746 width=0) (actual time=307.916..307.916 rows=4117718 loops=1)
Index Cond: (querynameid = 1)
-> Hash (cost=162842.29..162842.29 rows=1371250 width=8) (actual time=534.065..534.065 rows=1372574 loops=1)
Buckets: 4096 Batches: 64 Memory Usage: 850kB
-> Index Scan using drilldown_pkey on drilldown d (cost=0.00..162842.29 rows=1371250 width=8) (actual time=0.015..364.657 rows=1372574 loops=1)
Total runtime: 5213.205 ms
(11 rows)
I know there are some tuning parameters I can adjust in PostgreSQL, but what I want to know is: is this query the most optimal way of joining the two tables?
Or should I be using some other form of INNER JOIN? I'm just not sure.
Any pointers are appreciated!
EDIT
database#\d drilldown
Table "public.drilldown"
Column | Type | Modifiers
---------------+-----------------------------+--------------------------------------------------------
transactionid | bigint | not null default nextval('drilldown_id_seq'::regclass)
userid | bigint | not null default 0
pathid | bigint | not null default 0
reqms | bigint | not null default 0
quems | bigint | not null default 0
clicktime | timestamp without time zone | default now()
Indexes:
"drilldown_pkey" PRIMARY KEY, btree (transactionid)
"drilldown_idx1" btree (clicktime)
Foreign-key constraints:
"drilldown_userid_fkey" FOREIGN KEY (userid) REFERENCES users(id)
Referenced by:
TABLE "querystats" CONSTRAINT "querystats_transactionid_fkey" FOREIGN KEY (transactionid) REFERENCES drilldown(transactionid)
database=# \d querystats
Table "public.querystats"
Column | Type | Modifiers
---------------+--------+---------------------------------------------------------
id | bigint | not null default nextval('querystats_id_seq'::regclass)
transactionid | bigint | not null default 0
querynameid | bigint | not null default 0
queryms | bigint | not null default 0
Indexes:
"querystats_pkey" PRIMARY KEY, btree (id)
"querystats_idx1" btree (transactionid)
"querystats_idx2" btree (querynameid)
Foreign-key constraints:
"querystats_querynameid_fkey" FOREIGN KEY (querynameid) REFERENCES queryname(id)
"querystats_transactionid_fkey" FOREIGN KEY (transactionid) REFERENCES drilldown(transactionid)
So here are the two tables as requested, plus the version:
PostgreSQL 9.1.7 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
So what this query is doing is getting the average of the queryms values across all rows for each query type (querynameid).
name | current_setting | source
----------------------------+----------------------------------+----------------------
application_name | psql | client
client_encoding | UTF8 | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
enable_seqscan | off | session
external_pid_file | /var/run/postgresql/9.1-main.pid | configuration file
lc_messages | en_US.UTF-8 | configuration file
lc_monetary | en_US.UTF-8 | configuration file
lc_numeric | en_US.UTF-8 | configuration file
lc_time | en_US.UTF-8 | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | environment variable
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 24MB | configuration file
ssl | on | configuration file
TimeZone | localtime | environment variable
unix_socket_directory | /var/run/postgresql | configuration file
(19 rows)
I see that enable_seqscan=off, I have not touched any settings, this is a completely default install.
UPDATE
I made some changes based on the comments below, and here are the results.
explain analyse SELECT (SELECT avg(queryms) AS total FROM querystats WHERE querynameid=3) as total FROM querystats qs JOIN drilldown d ON (qs.transactionid=d.transactionid) WHERE qs.querynameid=3 limit 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=196775.99..196776.37 rows=1 width=0) (actual time=2320.876..2320.876 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Aggregate (cost=196775.94..196775.99 rows=1 width=8) (actual time=2320.815..2320.815 rows=1 loops=1)
-> Bitmap Heap Scan on querystats (cost=24354.25..189291.69 rows=2993698 width=8) (actual time=226.516..1144.690 rows=2999798 loops=1)
Recheck Cond: (querynameid = 3)
-> Bitmap Index Scan on querystats_idx (cost=0.00..23605.83 rows=2993698 width=0) (actual time=225.119..225.119 rows=2999798 loops=1)
Index Cond: (querynameid = 3)
-> Nested Loop (cost=0.00..1127817.12 rows=2993698 width=0) (actual time=2320.876..2320.876 rows=1 loops=1)
-> Seq Scan on drilldown d (cost=0.00..76745.10 rows=1498798 width=8) (actual time=0.009..0.009 rows=1 loops=1)
-> Index Scan using querystats_idx on querystats qs (cost=0.00..0.60 rows=2 width=8) (actual time=0.045..0.045 rows=1 loops=1)
Index Cond: ((querynameid = 3) AND (transactionid = d.transactionid))
Total runtime: 2320.940 ms
(12 rows)

It's behaving as though you have enable_seqscan = off, because it is using an index scan to populate a hash table. Never turn any of the planner enable_* options off except as a diagnostic step, and if you are showing a plan, please show any non-default options along with it. Running this shows a lot of the useful information:
SELECT version();
SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');
It also helps if you tell us about the runtime environment, especially the amount of RAM on the machine, what your storage system looks like, and the size of the database (or even better, the active data set of frequently referenced data in the database).
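For the last item, the database size can be pulled with a quick query like this (a sketch; pg_size_pretty() just makes the byte count readable):
SELECT pg_size_pretty(pg_database_size(current_database()));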
Roughly, the 5.2 seconds breaks down to:
1.3 seconds to find the 4,117,717 querystats rows that match your selection criterion.
2.3 seconds to randomly match those against drilldown records.
1.6 seconds to pass the 4,117,717 rows and calculate an average.
So, even though you seem to have crippled its ability to use the fastest plan, it is taking only 1.26 microseconds (millionths of a second) to locate each row, join it to another, and work it into a calculation of an average. That's not too bad on an absolute basis, but you can almost certainly get a slightly faster plan.
First off, if you are using 9.2.x where x is less than 3, upgrade to 9.2.3 immediately. There was a performance regression for some types of plans, fixed in that release, which might affect this query. In general, try to stay up to date on minor releases (where the version number changes past the second dot).
You can test different plans in a single session by setting planning factors on just that connection and running your query (or an EXPLAIN on it). Try something like this:
SET seq_page_cost = 0.1;
SET random_page_cost = 0.1;
SET cpu_tuple_cost = 0.05;
SET effective_cache_size = '3GB'; -- actually use shared_buffers plus OS cache
Make sure that all enable_ settings are on.
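To confirm that, all the planner enable_* switches can be listed in one go (a sketch):
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'enable%';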

You claim in your question:
I see that enable_seqscan=off, I have not touched any settings, this is a completely default install.
In contrast, the output from pg_settings tells us:
enable_seqscan | off | session
Meaning that enable_seqscan = off was set in your session. Something is not adding up here.
Run
SET enable_seqscan = on;
or
RESET enable_seqscan;
Then verify:
SHOW enable_seqscan;
Also, your setting for shared_buffers is way too low for a db with millions of records. 24MB seems to be the conservative setting of Ubuntu out-of-the-box. You need to edit your configuration files for serious use! I quote the manual:
If you have a dedicated database server with 1GB or more of RAM, a
reasonable starting value for shared_buffers is 25% of the memory in your system.
So edit your postgresql.conf file to increase the value and restart the server (unlike most settings, shared_buffers only takes effect after a restart, not a reload).
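As a rough sketch, the relevant lines in postgresql.conf might look like this (illustrative values for a machine with around 4GB of RAM, following the 25% rule quoted above):
# postgresql.conf
shared_buffers = 1GB
effective_cache_size = 3GB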
Then try your query again and find out how enable_seqscan was turned off.

In this query
select avg(qs.queryms)
from querystats qs
join drilldown d
on (qs.transactionid=d.transactionid)
where querynameid=1;
you're not using any of the columns from the table "drilldown". Since the foreign key constraint guarantees there's a row in "drilldown" for every "transactionid" in "querystats", I don't think the join will do anything useful. Unless I've missed something, your query is equivalent to
select avg(qs.queryms)
from querystats qs
where querynameid=1;
No join at all. As long as there's an index on "querynameid" you should get decent performance.

When you don't join, avg(qs.queryms) only has to scan the querystats rows that match the filter.
When you do the join, the aggregate has to consume every row the join produces, on top of the cost of the join itself.
If you're always interested in a single querynameid, try putting avg(qs.queryms) in a subselect:
SELECT
(SELECT avg(queryms) FROM querystats WHERE querynameid=1)
FROM querystats qs
JOIN drilldown d ON (qs.transactionid=d.transactionid)
WHERE qs.querynameid=1;
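And if you eventually want the average for every query type at once (which is what the question describes), the join can be skipped entirely; a sketch:
SELECT querynameid, avg(queryms) AS avg_queryms
FROM querystats
GROUP BY querynameid;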

The querystats table looks like a fat junction table to me. In that case: omit the surrogate key, live on the natural (composite) key (both components are already NOT NULL), and add a reversed composite index. (The separate single-column indices become redundant: the composite primary key and the reversed index cover both columns.)
-- CREATE SEQUENCE querystats_id_seq;
CREATE TABLE querystats (
-- id bigint NOT NULL DEFAULT nextval('querystats_id_seq'),
transactionid bigint NOT NULL default 0 REFERENCES drilldown(transactionid),
querynameid bigint NOT NULL default 0 REFERENCES queryname(id),
queryms bigint NOT NULL default 0,
PRIMARY KEY (transactionid, querynameid)
);
-- ALTER SEQUENCE querystats_id_seq OWNED BY querystats.id;
--CREATE INDEX querystats_idx1 ON querystats (transactionid);
-- CREATE INDEX querystats_idx2 ON querystats (querynameid);
CREATE UNIQUE INDEX querystats_alt ON querystats (querynameid, transactionid);
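If rebuilding the table is not practical, roughly the same effect can be had on the existing table; a sketch using the index names from the question (check first that nothing else relies on querystats_idx2):
-- Non-unique to be safe; make it UNIQUE only if (querynameid, transactionid) has no duplicates in the existing data.
CREATE INDEX querystats_alt ON querystats (querynameid, transactionid);
DROP INDEX querystats_idx2; -- the single-column index is redundant once the composite one exists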

Related

sysdate() causes Postgres to ignore the Index and do a costly Sequential Scan

Has anyone ever encountered this? Postgres Enterprise DB Advanced Server 11.5.12.
sysdate() (Oracle proprietary) results in a Seq Scan of, in this case, 4,782 rows:
EXPLAIN SELECT p.id, p.practice
FROM PatientStatistics ps
INNER JOIN Patients p
ON p.id=ps.patient
WHERE ps.nextfutureapptdateservertime <= sysdate()
ORDER BY p.id ASC;
Hash Join (cost=799.81..1761.53 rows=4782 width=8)
Hash Cond: (p.id = ps.patient)
-> Index Only Scan using patients_index3 on patients p (cost=0.29..921.44 rows=15442 width=8)
-> Hash (cost=644.11..644.11 rows=4782 width=4)
-> Seq Scan on patientstatistics ps (cost=0.00..644.11 rows=4782 width=4)
Filter: (nextfutureapptdateservertime <= sysdate)
Changing to now() or current_timestamp (SQL Standard) fixes the issue. Postgres is correctly using the Index:
EXPLAIN SELECT p.id, p.practice
FROM PatientStatistics ps
INNER JOIN Patients p
ON p.id=ps.patient
WHERE ps.nextfutureapptdateservertime <= now()
ORDER BY p.id ASC;
Nested Loop (cost=0.57..51.41 rows=17 width=8)
-> Index Only Scan using "patientstatisti_idx$$_0c9a0048" on patientstatistics ps (cost=0.29..8.53 rows=17 width=4)
Index Cond: (nextfutureapptdateservertime <= now())
-> Index Scan using patients_pk on patients p (cost=0.29..2.52 rows=1 width=8)
Index Cond: (id = ps.patient)
It is interesting to note the difference in output of those functions:
SELECT now();
SELECT current_timestamp;
15-JAN-20 09:36:41.932741 -05:00
15-JAN-20 09:36:41.932930 -05:00
SELECT sysdate();
15-JAN-20 09:37:17
Perhaps Postgres's date indexes are keyed on datetimes that have a decimal portion. The planner sees it was passed a date without a decimal portion, knows the index keys won't line up exactly, and backs off to a scan to ensure the query delivers 100% accurate results.
I could find nothing about this online after 30 minutes of Googling.
I don't know EDB's proprietary fork, so the following is based on guesswork.
now() or (equivalently) current_timestamp is a STABLE function, so it returns the same value if it is evaluated more than once in the course of a statement execution (and indeed of a transaction).
The suspicion is that sysdate, like PostgreSQL's clock_timestamp(), is VOLATILE (returns the actual time).
Then the function can have a different value every time it is compared to a row, which makes it impossible to use an index scan.
If my suspicion is not correct, I'd call it an EDB bug.
I don't know how they implemented it, but this workaround functions correctly here:
CREATE OR REPLACE FUNCTION mysysdate(OUT timestamptz)
AS
$func$
select now();
$func$
language sql stable;
select mysysdate() ;
EXPLAIN select *
FROM public.feature_timeslice
WHERE valid_time_begin < mysysdate() - '10 year + 14 days'::interval;
select version() ;
\df+ mysysdate
Output:
CREATE FUNCTION
mysysdate
-------------------------------
2020-01-15 17:15:13.896497+01
(1 row)
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Index Scan using feature_timeslice_alt2 on feature_timeslice (cost=0.42..4474.84 rows=9206 width=28)
Index Cond: (valid_time_begin < (now() - '10 years 14 days'::interval))
(2 rows)
version
-------------------------------------------------------------------------------------------------------
PostgreSQL 11.3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
(1 row)
List of functions
Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description
--------+-----------+--------------------------+------------------------------+------+------------+----------+----------+----------+-------------------+----------+---------------+-------------
tmp | mysysdate | timestamp with time zone | OUT timestamp with time zone | func | stable | unsafe | postgres | invoker | | sql | +|
| | | | | | | | | | | select now();+|
| | | | | | | | | | | |
(1 row)
Note: the granularity does not affect the query plan,
select date_trunc('sec', now());
also results in an indexscan.
Yup. It must be the volatility thing. PG's docs on the matter: https://www.postgresql.org/docs/8.2/xfunc-volatility.html
They show timeofday() as an example of a VOLATILE function.
now() - STABLE - the time when the query began. Call it 6 times in the same query and it returns the same time.
timeofday() and sysdate() - VOLATILE - the time at the moment the function itself is called, not when the query began. It's like shelling out to the operating system's date tool. Call it 6 times in the same query and you'll get 6 different times.
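A quick way to see the difference from psql (a sketch; clock_timestamp() is stock PostgreSQL's VOLATILE counterpart, sysdate() being EDB-only):
SELECT now() = now() AS stable_equal,                          -- true: now() is fixed within the transaction
       clock_timestamp() = clock_timestamp() AS volatile_equal; -- almost always false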

postgresql hot updates table not working

I have a table:
CREATE TABLE my_table
(
id bigint NOT NULL,
data1 character varying(255),
data2 character varying(100000),
double1 double precision,
double2 double precision,
id2 bigint
);
There is an index on id2 (id2 is a foreign key).
And I have a query:
update my_table set double2 = :param where id2 = :id2;
This query uses the index on id2, but it runs very, very slowly.
I expected my query to use HOT updates, but that is not happening.
I checked for HOT updates with this query:
SELECT pg_stat_get_xact_tuples_hot_updated('my_table'::regclass::oid);
and it always returns zero.
What am I doing wrong? How can I speed up my update query?
The Postgres version is 9.4.11.
UPD:
execution plan for update:
Update on my_table (cost=0.56..97681.01 rows=34633 width=90) (actual time=42082.915..42082.915 rows=0 loops=1)
-> Index Scan using my_index on my_table (cost=0.56..97681.01 rows=34633 width=90) (actual time=0.110..330.563 rows=97128 loops=1)
Output: id, data1, data2, 0.5::double precision, double1, id2, ctid
Index Cond: (my_table.id2 = 379262689897216::bigint)
Planning time: 1.246 ms
Execution time: 42082.986 ms
The requirements for HOT updates are:
that you're updating only fields that aren't used in any indexes
that the page that contains the row you're updating has extra space in it (fillfactor should be less than 100)
Based on your comments, you seem to meet both requirements.
But one thing I noticed is that you said you're using pg_stat_get_xact_tuples_hot_updated to check if HOT updates are happening; be aware that this function returns only the number of HOT-updated rows in the current transaction, not from all time. My guess is HOT updates are happening, but you used the wrong function to detect them. If instead you use
SELECT pg_stat_get_tuples_hot_updated('my_table'::regclass::oid);
you can get the total number of HOT-updated rows for all time.
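If the counter stays at zero even there, the usual next step is to leave free space on each page and then re-check the cumulative counters; a sketch (the fillfactor of 90 is an arbitrary starting point):
ALTER TABLE my_table SET (fillfactor = 90); -- leave ~10% of every page free for HOT updates
VACUUM FULL my_table;                       -- rewrite existing pages so they pick up the new fillfactor
SELECT n_tup_upd, n_tup_hot_upd             -- cumulative counters for the table, not per-transaction
FROM pg_stat_user_tables
WHERE relname = 'my_table';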

LOWER LIKE vs iLIKE

How does the performance of the following two query components compare?
LOWER LIKE
... LOWER(description) LIKE '%abcde%' ...
iLIKE
... description iLIKE '%abcde%' ...
The answer depends on many factors like Postgres version, encoding and locale - LC_COLLATE in particular.
The bare expression lower(description) LIKE '%abc%' is typically a bit faster than description ILIKE '%abc%', and either is a bit faster than the equivalent regular expression: description ~* 'abc'. This matters for sequential scans where the expression has to be evaluated for every tested row.
But for big tables like you demonstrate in your answer one would certainly use an index. For arbitrary patterns (not only left-anchored) I suggest a trigram index using the additional module pg_trgm. Then we talk about milliseconds instead of seconds and the difference between the above expressions is nullified.
GIN and GiST indexes (using the gin_trgm_ops or gist_trgm_ops operator classes) support LIKE (~~), ILIKE (~~*), ~, ~* (and some more variants) alike. With a trigram GIN index on description (typically bigger than GiST, but faster for reads), your query would use description ILIKE 'case_insensitive_pattern'.
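For reference, such a trigram index might be created like this (a sketch; the books table is borrowed from the test further down, and the index names are made up):
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- supports description ILIKE '%abc%', description LIKE '%abc%', description ~* 'abc'
CREATE INDEX books_description_trgm_idx ON books USING gin (description gin_trgm_ops);
-- expression-index variant, only worth it if you query lower(description) exclusively
CREATE INDEX books_lower_description_trgm_idx ON books USING gin (lower(description) gin_trgm_ops);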
Related:
PostgreSQL LIKE query performance variations
Similar UTF-8 strings for autocomplete field
Basics for pattern matching in Postgres:
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
When working with said trigram index it's typically more practical to work with:
description ILIKE '%abc%'
Or with the case-insensitive regexp operator (without % wildcards):
description ~* 'abc'
An index on (description) does not support queries on lower(description) like:
lower(description) LIKE '%abc%'
And vice versa.
With predicates on lower(description) exclusively, the expression index is the slightly better option.
In all other cases, an index on (description) is preferable as it supports both case-sensitive and -insensitive predicates.
According to my tests (ten of each query), LOWER LIKE is about 17% faster than iLIKE.
Explanation
I created a million rows contain some random mixed text data:
require 'securerandom'
inserts = []
1000000.times do |i|
inserts << "(1, 'fake', '#{SecureRandom.urlsafe_base64(64)}')"
end
sql = "insert into books (user_id, title, description) values #{inserts.join(', ')}"
ActiveRecord::Base.connection.execute(sql)
Verify the number of rows:
my_test_db=# select count(id) from books ;
count
---------
1000009
(Yes, I have nine extra rows from other tests - not a problem.)
Example query and results:
my_test_db=# SELECT "books".* FROM "books" WHERE "books"."published" = 'f'
my_test_db=# and (LOWER(description) LIKE '%abcde%') ;
id | user_id | title | description | published
---------+---------+-------+----------------------------------------------------------------------------------------+------
1232322 | 1 | fake | 5WRGr7oCKABcdehqPKsUqV8ji61rsNGS1TX6pW5LJKrspOI_ttLNbaSyRz1BwTGQxp3OaxW7Xl6fzVpCu9y3fA | f
1487103 | 1 | fake | J6q0VkZ8-UlxIMZ_MFU_wsz_8MP3ZBQvkUo8-2INiDIp7yCZYoXqRyp1Lg7JyOwfsIVdpPIKNt1uLeaBCdelPQ | f
1817819 | 1 | fake | YubxlSkJOvmQo1hkk5pA1q2mMK6T7cOdcU3ADUKZO8s3otEAbCdEcmm72IOxiBdaXSrw20Nq2Lb383lq230wYg | f
Results for LOWER LIKE
my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (LOWER(description) LIKE '%abcde%') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..32420.14 rows=1600 width=117) (actual time=938.627..4114.038 rows=3 loops=1)
Filter: ((NOT published) AND (lower(description) ~~ '%abcde%'::text))
Rows Removed by Filter: 1000006
Total runtime: 4114.098 ms
Results for iLIKE
my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (description iLIKE '%abcde%') ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..29920.11 rows=100 width=117) (actual time=1147.612..4986.771 rows=3 loops=1)
Filter: ((NOT published) AND (description ~~* '%abcde%'::text))
Rows Removed by Filter: 1000006
Total runtime: 4986.831 ms
Database info disclosure
Postgres version:
my_test_db=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.4 on x86_64-apple-darwin12.4.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit
Collation setting:
my_test_db=# select datcollate from pg_database where datname = 'my_test_db';
datcollate
-------------
en_CA.UTF-8
Table definition:
my_test_db=# \d books
Table "public.books"
Column | Type | Modifiers
-------------+-----------------------------+-------------------------------------------------------
id | integer | not null default nextval('books_id_seq'::regclass)
user_id | integer | not null
title | character varying(255) | not null
description | text | not null default ''::text
published | boolean | not null default false
Indexes:
"books_pkey" PRIMARY KEY, btree (id)
In my Rails project, ILIKE is almost 10x faster than LOWER LIKE. I added a GIN index on the entities.name column:
> Entity.where("LOWER(name) LIKE ?", name.strip.downcase).limit(1).first
Entity Load (2443.9ms) SELECT "entities".* FROM "entities" WHERE (lower(name) like 'baidu') ORDER BY "entities"."id" ASC LIMIT $1 [["LIMIT", 1]]
> Entity.where("name ILIKE ?", name.strip).limit(1).first
Entity Load (285.0ms) SELECT "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT $1 [["LIMIT", 1]]
# explain analyze SELECT "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3186.03..3186.04 rows=1 width=1588) (actual time=7.812..7.812 rows=1 loops=1)
-> Sort (cost=3186.03..3187.07 rows=414 width=1588) (actual time=7.811..7.811 rows=1 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 26kB
-> Bitmap Heap Scan on entities (cost=1543.21..3183.96 rows=414 width=1588) (actual time=7.797..7.805 rows=1 loops=1)
Recheck Cond: ((name)::text ~~* 'Baidu'::text)
Rows Removed by Index Recheck: 6
Heap Blocks: exact=7
-> Bitmap Index Scan on index_entities_on_name (cost=0.00..1543.11 rows=414 width=0) (actual time=7.787..7.787 rows=7 loops=1)
Index Cond: ((name)::text ~~* 'Baidu'::text)
Planning Time: 6.375 ms
Execution Time: 7.874 ms
(12 rows)
A GIN index is really helpful for improving ILIKE performance.

Postgres Hash Join speed [closed]

I have 3 tables which I wish to join using inner joins in Postgres 9.1, reads, devices, and device_patients. Below is an abbreviated schema for each table.
reads -- ~250,000 rows
CREATE TABLE reads
(
id serial NOT NULL,
device_id integer NOT NULL,
value bigint NOT NULL,
read_datetime timestamp without time zone NOT NULL,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
CONSTRAINT reads_pkey PRIMARY KEY (id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE reads
OWNER TO postgres;
CREATE INDEX index_reads_on_device_id
ON reads
USING btree
(device_id );
CREATE INDEX index_reads_on_read_datetime
ON reads
USING btree
(read_datetime );
devices -- ~500 rows
CREATE TABLE devices
(
id serial NOT NULL,
serial_number character varying(20) NOT NULL,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
CONSTRAINT devices_pkey PRIMARY KEY (id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE devices
OWNER TO postgres;
CREATE UNIQUE INDEX index_devices_on_serial_number
ON devices
USING btree
(serial_number COLLATE pg_catalog."default" );
patient_devices -- ~25,000 rows
CREATE TABLE patient_devices
(
id serial NOT NULL,
patient_id integer NOT NULL,
device_id integer NOT NULL,
issuance_datetime timestamp without time zone NOT NULL,
unassignment_datetime timestamp without time zone,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
CONSTRAINT patient_devices_pkey PRIMARY KEY (id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE patient_devices
OWNER TO postgres;
CREATE INDEX index_patient_devices_on_device_id
ON patient_devices
USING btree
(device_id );
CREATE INDEX index_patient_devices_on_issuance_datetime
ON patient_devices
USING btree
(issuance_datetime );
CREATE INDEX index_patient_devices_on_patient_id
ON patient_devices
USING btree
(patient_id );
CREATE INDEX index_patient_devices_on_unassignment_datetime
ON patient_devices
USING btree
(unassignment_datetime );
patients -- ~1,000 rows
CREATE TABLE patients
(
id serial NOT NULL,
first_name character varying(50) NOT NULL,
middle_name character varying(50),
last_name character varying(50) NOT NULL,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
CONSTRAINT participants_pkey PRIMARY KEY (id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE patients
OWNER TO postgres;
Here is my abbreviated query.
SELECT device_patients.patient_id, serial_number FROM reads
INNER JOIN devices ON devices.id = reads.device_id
INNER JOIN patient_devices ON device_patients.device_id = devices.id
WHERE (reads.read_datetime BETWEEN '2012-01-01 10:30:01.000000' AND '2013-05-18 03:03:42')
AND (read_datetime > issuance_datetime) AND ((unassignment_datetime IS NOT NULL AND read_datetime < unassignment_datetime) OR
(unassignment_datetime IS NULL))
GROUP BY serial_number, patient_devices.patient_id LIMIT 10
Ultimately this will be a small part of a larger query (without the LIMIT; I only added the limit to prove to myself that the long runtime was not due to returning a bunch of rows), but I've done a bunch of experimenting and determined that this is the slow part of the larger query. When I run EXPLAIN ANALYZE on this query I get the following output:
Limit (cost=156442.31..156442.41 rows=10 width=13) (actual time=2815.435..2815.441 rows=10 loops=1)
-> HashAggregate (cost=156442.31..159114.89 rows=267258 width=13) (actual time=2815.432..2815.437 rows=10 loops=1)
-> Hash Join (cost=1157.78..151455.79 rows=997304 width=13) (actual time=30.930..2739.164 rows=250150 loops=1)
Hash Cond: (reads.device_id = devices.id)
Join Filter: ((reads.read_datetime > patient_devices.issuance_datetime) AND (((patient_devices.unassignment_datetime IS NOT NULL) AND (reads.read_datetime < patient_devices.unassignment_datetime)) OR (patient_devices.unassignment_datetime IS NULL)))
-> Seq Scan on reads (cost=0.00..7236.94 rows=255396 width=12) (actual time=0.035..64.433 rows=255450 loops=1)
Filter: ((read_datetime >= '2012-01-01 10:30:01'::timestamp without time zone) AND (read_datetime <= '2013-05-18 03:03:42'::timestamp without time zone))
-> Hash (cost=900.78..900.78 rows=20560 width=37) (actual time=30.830..30.830 rows=25015 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 1755kB
-> Hash Join (cost=19.90..900.78 rows=20560 width=37) (actual time=0.776..20.551 rows=25015 loops=1)
Hash Cond: (patient_devices.device_id = devices.id)
-> Seq Scan on patient_devices (cost=0.00..581.93 rows=24893 width=24) (actual time=0.014..7.867 rows=25545 loops=1)
Filter: ((unassignment_datetime IS NOT NULL) OR (unassignment_datetime IS NULL))
-> Hash (cost=13.61..13.61 rows=503 width=13) (actual time=0.737..0.737 rows=503 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 24kB
-> Seq Scan on devices (cost=0.00..13.61 rows=503 width=13) (actual time=0.016..0.466 rows=503 loops=1)
Filter: (entity_id = 2)
Total runtime: 2820.392 ms
My question is: how do I speed this up? Right now I'm running this on my Windows machine for testing, but ultimately it will be deployed on Ubuntu; will that make a difference? Any insight into why this takes 2 seconds would be greatly appreciated.
Thanks
It has been suggested that the LIMIT might be altering the query plan. Here is the same query without the LIMIT. The slow part still appears to be the Hash Join.
Also, here are the relevant tuning parameters. Again, I'm only testing this on Windows right now, and I don't know what effect this would have on a Linux machine.
shared_buffers = 2GB
effective_cache_size = 4GB
work_mem = 256MB
random_page_cost = 2.0
Here are the statistics for the reads table
Statistic Value
Sequential Scans 130
Sequential Tuples Read 28865850
Index Scans 283630
Index Tuples Fetched 141421907
Tuples Inserted 255450
Tuples Updated 0
Tuples Deleted 0
Tuples HOT Updated 0
Live Tuples 255450
Dead Tuples 0
Heap Blocks Read 20441
Heap Blocks Hit 3493033
Index Blocks Read 8824
Index Blocks Hit 4840210
Toast Blocks Read
Toast Blocks Hit
Toast Index Blocks Read
Toast Index Blocks Hit
Last Vacuum 2013-05-20 09:23:03.782-07
Last Autovacuum
Last Analyze 2013-05-20 09:23:03.91-07
Last Autoanalyze 2013-05-17 19:01:44.075-07
Vacuum counter 1
Autovacuum counter 0
Analyze counter 1
Autoanalyze counter 6
Table Size 27 MB
Toast Table Size none
Indexes Size 34 MB
Here are the statistics for the devices table
Statistic Value
Sequential Scans 119
Sequential Tuples Read 63336
Index Scans 1053935
Index Tuples Fetched 1053693
Tuples Inserted 609
Tuples Updated 0
Tuples Deleted 0
Tuples HOT Updated 0
Live Tuples 609
Dead Tuples 0
Heap Blocks Read 32
Heap Blocks Hit 1054553
Index Blocks Read 32
Index Blocks Hit 2114305
Toast Blocks Read
Toast Blocks Hit
Toast Index Blocks Read
Toast Index Blocks Hit
Last Vacuum
Last Autovacuum
Last Analyze
Last Autoanalyze 2013-05-17 19:02:49.692-07
Vacuum counter 0
Autovacuum counter 0
Analyze counter 0
Autoanalyze counter 2
Table Size 48 kB
Toast Table Size none
Indexes Size 128 kB
Here are the statistics for the patient_devices table
Statistic Value
Sequential Scans 137
Sequential Tuples Read 3065400
Index Scans 853990
Index Tuples Fetched 46143763
Tuples Inserted 25545
Tuples Updated 24936
Tuples Deleted 0
Tuples HOT Updated 0
Live Tuples 25547
Dead Tuples 929
Heap Blocks Read 1959
Heap Blocks Hit 6099617
Index Blocks Read 1077
Index Blocks Hit 2462681
Toast Blocks Read
Toast Blocks Hit
Toast Index Blocks Read
Toast Index Blocks Hit
Last Vacuum
Last Autovacuum 2013-05-17 19:01:44.576-07
Last Analyze
Last Autoanalyze 2013-05-17 19:01:44.697-07
Vacuum counter 0
Autovacuum counter 6
Analyze counter 0
Autoanalyze counter 6
Table Size 2624 kB
Toast Table Size none
Indexes Size 5312 kB
Below is the full query that I'm trying to speed up. The smaller query is indeed faster, but I was unable to make the full query, reproduced below, any faster. As suggested, I added 4 new indices: UNIQUE(device_id, issuance_datetime), UNIQUE(device_id, unassignment_datetime), UNIQUE(patient_id, issuance_datetime), and UNIQUE(patient_id, unassignment_datetime).
SELECT
first_name
, last_name
, MAX(max_read) AS read_datetime
, SUM(value) AS value
, serial_number
FROM (
SELECT
pa.first_name
, pa.last_name
, value
, first_value(de.serial_number) OVER(PARTITION BY pa.id ORDER BY re.read_datetime DESC) AS serial_number -- I'm not sure if this is a good way to do this, but I don't know of another way
, re.read_datetime
, MAX(re.read_datetime) OVER (PARTITION BY pd.id) AS max_read
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
AND re.read_datetime >= pd.issuance_datetime
AND re.read_datetime < COALESCE(pd.unassignment_datetime , 'infinity'::timestamp)
INNER JOIN patients pa ON pa.id = pd.patient_id
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01' AND '2013-05-18 03:03:42'
) AS foo WHERE read_datetime = max_read
GROUP BY first_name, last_name, serial_number ORDER BY value desc
LIMIT 10
Sorry for not posting this earlier, but I thought this query would be too complicated and was trying to simplify the problem; apparently I still can't figure it out. It seems like it would be a LOT quicker if I could limit the results returned by the nested select using the max_read variable, but according to numerous sources, that isn't allowed in Postgres.
FYI: sanitised query:
SELECT pd.patient_id
, de.serial_number
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
AND re.read_datetime >= pd.issuance_datetime -- changed this from '>' to '>='
AND (re.read_datetime < pd.unissuance_datetime OR pd.unissuance_datetime IS NULL)
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01.000000' AND '2013-05-18 03:03:42'
GROUP BY de.serial_number, pd.patient_id
LIMIT 10
;
UPDATE: without the original typos:
EXPLAIN ANALYZE
SELECT pd.patient_id
, de.serial_number
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
AND re.read_datetime >= pd.issuance_datetime
AND (re.read_datetime < pd.unassignment_datetime OR pd.unassignment_datetime IS NULL)
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01.000000' AND '2013-05-18 03:03:42'
GROUP BY de.serial_number, pd.patient_id
LIMIT 10
;
UPDATE: this is about 6 times as fast here (on synthetic data, and with a slightly altered data model)
-- Modified data model + synthetic data:
CREATE TABLE devices
( id serial NOT NULL
, serial_number character varying(20) NOT NULL
-- , created_at timestamp without time zone NOT NULL
-- , updated_at timestamp without time zone NOT NULL
, CONSTRAINT devices_pkey PRIMARY KEY (id )
, UNIQUE (serial_number)
) ;
CREATE TABLE reads
-- ( id serial NOT NULL PRIMARY KEY -- You don't need this surrogate key
( device_id integer NOT NULL REFERENCES devices (id)
, value bigint NOT NULL
, read_datetime timestamp without time zone NOT NULL
-- , created_at timestamp without time zone NOT NULL
-- , updated_at timestamp without time zone NOT NULL
, PRIMARY KEY ( device_id, read_datetime)
) ;
CREATE TABLE patient_devices
-- ( id serial NOT NULL PRIMARY KEY -- You don't need this surrogate key
( patient_id integer NOT NULL -- REFERENCES patients (id)
, device_id integer NOT NULL REFERENCES devices(id)
, issuance_datetime timestamp without time zone NOT NULL
, unassignment_datetime timestamp without time zone
-- , created_at timestamp without time zone NOT NULL
-- , updated_at timestamp without time zone NOT NULL
, PRIMARY KEY (device_id, issuance_datetime)
, UNIQUE (device_id, unassignment_datetime)
) ;
-- CREATE INDEX index_patient_devices_on_issuance_datetime ON patient_devices (device_id, unassignment_datetime );
-- may need some additional indices later
-- devices -- ~500 rows
INSERT INTO devices(serial_number) SELECT 'No_' || gs::text FROM generate_series(1,500) gs;
-- reads -- ~100K rows
INSERT INTO reads(device_id, read_datetime, value)
SELECT de.id, gs
, (random()*1000000)::bigint
FROM devices de
JOIN generate_series('2012-01-01', '2013-05-01' , '1 hour' ::interval) gs
ON random() < 0.02;
-- patient_devices -- ~25,000 rows
INSERT INTO patient_devices(device_id, issuance_datetime, patient_id)
SELECT DISTINCT ON (re.device_id, read_datetime)
re.device_id, read_datetime, pa
FROM generate_series(1,100) pa
JOIN reads re
ON random() < 0.01;
-- close the open intervals
UPDATE patient_devices dst
SET unassignment_datetime = src.issuance_datetime
FROM patient_devices src
WHERE src.device_id = dst.device_id
AND src.issuance_datetime > dst.issuance_datetime
AND NOT EXISTS ( SELECT *
FROM patient_devices nx
WHERE nx.device_id = src.device_id
AND nx.issuance_datetime > dst.issuance_datetime
AND nx.issuance_datetime < src.issuance_datetime
)
;
VACUUM ANALYZE patient_devices;
VACUUM ANALYZE devices;
VACUUM ANALYZE reads;
-- EXPLAIN ANALYZE
SELECT pd.patient_id
, de.serial_number
--, COUNT (*) AS zcount
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
AND re.read_datetime >= pd.issuance_datetime
AND re.read_datetime < COALESCE(pd.unassignment_datetime , 'infinity'::timestamp)
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01' AND '2013-05-18 03:03:42'
GROUP BY de.serial_number, pd.patient_id
LIMIT 10
;
Look at the parts of the EXPLAIN ANALYZE report where you see Seq Scan.
For example, these parts could use some indexes:
Seq Scan on patient_devices -> unassignment_datetime
Seq Scan on devices -> entity_id
Seq Scan on reads -> read_datetime
About read_datetime: it is possible to create an index that serves range comparisons like > and <=, which will come in handy. I don't know the exact syntax offhand, though.
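For what it's worth, a plain btree index already serves range comparisons like >= and <, and a composite index can cover the join key and the time filter together; a sketch (the index name is made up):
CREATE INDEX reads_device_id_read_datetime_idx ON reads (device_id, read_datetime);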

Slow query time in postgresql when UTC milliseconds is stored as bigint

We are migrating from a time series database (ECHO historian) to an open source database, basically due to the price factor. Our choice was PostgreSQL, as we could not find an open source time series database. All we used to store in ECHO was time and value pairs.
Now here is the problem. The table I created in Postgres consists of 2 columns. The first is of type "bigint", storing the time in UTC milliseconds (a 13-digit number), and the second is the value, whose data type is "real". I filled it with around 3.6 million rows of data (spread across a time range of 30 days), and when I query a small time range (say 1 day) the query takes 4 seconds, whereas for the same time range ECHO responds in 150 milliseconds!
That is a huge difference. Having a bigint for time seems to be the reason for the slowness, but I am not sure. Could you please suggest how the query time can be improved?
I also read about the data types "timestamp" and "timestamptz", and it looks like we would need to store the date and time in a regular format rather than UTC milliseconds. Would this help speed up the query?
Here is my table definition :
Table "public. MFC2 Flow_LCL "
Column | Type | Modifiers | Storage | Stats target | Description
----------+--------+-----------+---------+--------------+-------------
the_time | bigint | | plain | |
value | real | | plain | |
Indexes:
"MFC2 Flow_LCL _time_idx" btree (the_time)
Has OIDs: no
Currently I am storing the time in UTC milliseconds (using bigint). The challenge here is that there could be duplicate time/value pairs.
This is the query I am using (called through a simple API which is passed the table name, start time, and end time):
PGresult *res;
int rec_count;
std::string sSQL;
sSQL.append("SELECT * FROM ");
sSQL.append(" \" ");
sSQL.append(table);
sSQL.append(" \" ");
sSQL.append(" WHERE");
sSQL.append(" time >= ");
CString sTime;
sTime.Format("%I64d",startTime);
sSQL.append(sTime);
sSQL.append(" AND time <= ");
CString eTime;
eTime.Format("%I64d",endTime);
sSQL.append(eTime);
sSQL.append(" ORDER BY time ");
res = PQexec(conn, sSQL.c_str());
Your time series database, if it works like a competitor I examined once, stores data in the order of the "time" column automatically in a heap-like structure. Postgres does not. As a result, you are doing an O(n) search [n=number of rows in table]: the entire table must be read to look for rows matching your time filter. A Primary Key on the timestamp (which creates a unique index) or, if timestamps are not unique, a regular index will give you binary O(log n) searches for single records and improved performance for all queries retrieving less than about 5% of the table. Postgres will estimate the crossover point between where an index scan or a full table scan is better.
You probably also want to CLUSTER (PG Docs) the table on that index.
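Using the index name from the question's \d output, that would look roughly like this (a sketch; the odd spacing inside the quoted names is copied from the question, and CLUSTER locks and rewrites the table, so run it in a maintenance window):
CLUSTER "MFC2 Flow_LCL " USING "MFC2 Flow_LCL _time_idx";
ANALYZE "MFC2 Flow_LCL ";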
Also, follow the advice above not to use time or other SQL reserved words as column names. Even when it is legal, it's asking for trouble.
[This would be better as a comment, but it is too long for that.]
Are you really planning for the year 2038 problem already? Why not just use an int for time as in standard UNIX?
SET search_path=tmp;
-- -------------------------------------------
-- create table and populate it with 10M rows
-- -------------------------------------------
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE old_echo
( the_time timestamp NOT NULL PRIMARY KEY
, payload DOUBLE PRECISION NOT NULL
);
INSERT INTO old_echo (the_time, payload)
SELECT now() - (gs * interval '1 msec')
, random()
FROM generate_series(1,10000000) gs
;
-- DELETE FROM old_echo WHERE random() < 0.8;
VACUUM ANALYZE old_echo;
SELECT MIN(the_time) AS first
, MAX(the_time) AS last
, (MAX(the_time) - MIN(the_time))::interval AS width
FROM old_echo
;
EXPLAIN ANALYZE
SELECT *
FROM old_echo oe
JOIN (
SELECT MIN(the_time) AS first
, MAX(the_time) AS last
, (MAX(the_time) - MIN(the_time))::interval AS width
, ((MAX(the_time) - MIN(the_time))/2)::interval AS half
FROM old_echo
) mima ON 1=1
WHERE oe.the_time >= mima.first + mima.half
AND oe.the_time < mima.first + mima.half + '1 sec':: interval
;
RESULT:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.06..59433.67 rows=1111124 width=64) (actual time=0.101..1.307 rows=1000 loops=1)
-> Result (cost=0.06..0.07 rows=1 width=0) (actual time=0.049..0.050 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.03 rows=1 width=8) (actual time=0.022..0.022 rows=1 loops=1)
-> Index Scan using old_echo_pkey on old_echo (cost=0.00..284873.62 rows=10000115 width=8) (actual time=0.021..0.021 rows=1 loops=1)
Index Cond: (the_time IS NOT NULL)
InitPlan 2 (returns $1)
-> Limit (cost=0.00..0.03 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
-> Index Scan Backward using old_echo_pkey on old_echo (cost=0.00..284873.62 rows=10000115 width=8) (actual time=0.009..0.009 rows=1 loops=1)
Index Cond: (the_time IS NOT NULL)
-> Index Scan using old_echo_pkey on old_echo oe (cost=0.01..34433.30 rows=1111124 width=16) (actual time=0.042..0.764 rows=1000 loops=1)
Index Cond: ((the_time >= (($0) + ((($1 - $0) / 2::double precision)))) AND (the_time < ((($0) + ((($1 - $0) / 2::double precision))) + '00:00:01'::interval)))
Total runtime: 1.504 ms
(13 rows)
UPDATE: since the timestamp appears to be non-unique (by the way, what do duplicates mean in that case?) I added an extra key column. An ugly hack, but it works here: query time 11 ms on 10M rows minus the 80% deleted (number of rows hit: 210 of 222067):
CREATE TABLE old_echo
( the_time timestamp NOT NULL
, the_seq SERIAL NOT NULL -- to catch the duplicate keys
, payload DOUBLE PRECISION NOT NULL
, PRIMARY KEY(the_time, the_seq)
);
-- Adding the random will cause some timestamps to be non-unique.
-- (and others to be non-existent)
INSERT INTO old_echo (the_time, payload)
SELECT now() - ((gs+random()*1000::integer) * interval '1 msec')
, random()
FROM generate_series(1,10000000) gs
;
DELETE FROM old_echo WHERE random() < 0.8;
