BUG #15779: Partition elimination doesn't work as expected when using PARTITION BY RANGE

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: damir(dot)ciganovic(dot)jankovic(at)gmail(dot)com
Subject: BUG #15779: Partition elimination doesn't work as expected when using PARTITION BY RANGE
Date: 2019-04-25 11:33:38
Message-ID: 15779-fd2fb982346b7a22@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15779
Logged by: Damir Ciganović-Janković
Email address: damir(dot)ciganovic(dot)jankovic(at)gmail(dot)com
PostgreSQL version: 11.2
Operating system: Windows, Centos 6/7
Description:

We have an issue that our queries are slow due to bad partition elimination
when executing queries for our partitioned tables, here is an example, I
will put all explain query outputs later for each explain separately:

------------

DROP TABLE IF EXISTS test;
DROP TYPE IF EXISTS test_enum;

CREATE TYPE test_enum AS ENUM ('FIRST', 'SECOND');
CREATE TABLE test (enum_col test_enum, timestamp_col timestamp, counter int)
PARTITION BY RANGE (enum_col, timestamp_col);

CREATE TABLE test_FIRST_1 PARTITION OF test FOR VALUES FROM ('FIRST',
'2019-01-01') TO ('FIRST', '2019-01-02');
CREATE TABLE test_FIRST_2 PARTITION OF test FOR VALUES FROM ('FIRST',
'2019-01-02') TO ('FIRST', '2019-01-03');

CREATE TABLE test_SECOND_1 PARTITION OF test FOR VALUES FROM ('SECOND',
'2019-01-01') TO ('SECOND', '2019-01-02');
CREATE TABLE test_SECOND_2 PARTITION OF test FOR VALUES FROM ('SECOND',
'2019-01-02') TO ('SECOND', '2019-01-03');

EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01';
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col = 'FIRST';
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col IN ('FIRST', 'SECOND');

------------

This is what we get for first explain:
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01';

Append (cost=0.00..132.68 rows=36 width=16) (actual time=0.044..0.044
rows=0 loops=1)
-> Seq Scan on test_first_1 (cost=0.00..33.13 rows=9 width=16) (actual
time=0.017..0.017 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)
-> Seq Scan on test_first_2 (cost=0.00..33.13 rows=9 width=16) (actual
time=0.010..0.010 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)
-> Seq Scan on test_second_1 (cost=0.00..33.13 rows=9 width=16) (actual
time=0.008..0.008 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)
-> Seq Scan on test_second_2 (cost=0.00..33.13 rows=9 width=16) (actual
time=0.007..0.007 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)

As we can see, we got all partitions despite setting timestamp. When we add
an enum value, partition pruning is fine:
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col = 'FIRST';

Append (cost=0.00..37.76 rows=1 width=16) (actual time=0.015..0.015 rows=0
loops=1)
-> Seq Scan on test_first_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.014..0.014 rows=0 loops=1)
Filter: ((timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone) AND (enum_col = 'FIRST'::test_enum))

Another bug here is when we put both enum values in IN clause, we still go
through all partitions despite timestamp targeting only _1 partitions:
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col IN ('FIRST', 'SECOND');

Append (cost=0.00..151.02 rows=4 width=16) (actual time=0.083..0.083
rows=0 loops=1)
-> Seq Scan on test_first_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.045..0.045 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))
-> Seq Scan on test_first_2 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.012..0.012 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))
-> Seq Scan on test_second_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.012..0.012 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))
-> Seq Scan on test_second_2 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.011..0.011 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))

We could fix this by adding check constraints. But I think that it should
work without adding constraints, that's why it seems like a bug to me.

------------

ALTER TABLE test_FIRST_1 ADD CONSTRAINT ck_enum_col CHECK(enum_col = 'FIRST'
AND timestamp_col >='2019-01-01' AND timestamp_col < '2019-01-02');
ALTER TABLE test_FIRST_2 ADD CONSTRAINT ck_enum_col CHECK(enum_col = 'FIRST'
AND timestamp_col >='2019-01-02' AND timestamp_col < '2019-01-03');
ALTER TABLE test_SECOND_1 ADD CONSTRAINT ck_enum_col CHECK(enum_col =
'SECOND' AND timestamp_col >='2019-01-01' AND timestamp_col <
'2019-01-02');
ALTER TABLE test_SECOND_2 ADD CONSTRAINT ck_enum_col CHECK(enum_col =
'SECOND' AND timestamp_col >='2019-01-02' AND timestamp_col <
'2019-01-03');

-- same explains

EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01';
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col = 'FIRST';
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col IN ('FIRST', 'SECOND');

------------

EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01';

Append (cost=0.00..66.34 rows=18 width=16) (actual time=0.034..0.034
rows=0 loops=1)
-> Seq Scan on test_first_1 (cost=0.00..33.13 rows=9 width=16) (actual
time=0.018..0.018 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)
-> Seq Scan on test_second_1 (cost=0.00..33.13 rows=9 width=16) (actual
time=0.014..0.014 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)

EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col = 'FIRST';

Append (cost=0.00..37.76 rows=1 width=16) (actual time=0.013..0.013 rows=0
loops=1)
-> Seq Scan on test_first_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.013..0.013 rows=0 loops=1)
Filter: ((timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone) AND (enum_col = 'FIRST'::test_enum))

EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col IN ('FIRST', 'SECOND');

Append (cost=0.00..75.51 rows=2 width=16) (actual time=0.037..0.037 rows=0
loops=1)
-> Seq Scan on test_first_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.025..0.025 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))
-> Seq Scan on test_second_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.010..0.010 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))

#######################################################

BONUS NOTE: In Postgres version 10.2 (we migrated from 10.2 to 11.2),
partition pruning were better in cases where we used no enum_col or multiple
enum_col values in where clause:

Only 2 partitions, instead of all four. Timestamp clause is still
respected
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01';

Append (cost=0.00..66.25 rows=18 width=16) (actual time=0.011..0.011
rows=0 loops=1)
-> Seq Scan on test_first_1 (cost=0.00..33.12 rows=9 width=16) (actual
time=0.007..0.007 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)
-> Seq Scan on test_second_1 (cost=0.00..33.12 rows=9 width=16) (actual
time=0.001..0.001 rows=0 loops=1)
Filter: (timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone)

Two partitions, instead of 1, this was fixed in
https://www.postgresql.org/message-id/2b20c220-50cb-3fae-da57-92e8cb3675dd%40lab.ntt.co.jp
Timestamp clause is respected in both versions
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col = 'FIRST';

Append (cost=0.00..75.50 rows=2 width=16) (actual time=0.007..0.007 rows=0
loops=1)
-> Seq Scan on test_first_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.004..0.004 rows=0 loops=1)
Filter: ((timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone) AND (enum_col = 'FIRST'::test_enum))
-> Seq Scan on test_second_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.002..0.002 rows=0 loops=1)
Filter: ((timestamp_col = '2019-01-01 00:00:00'::timestamp without
time zone) AND (enum_col = 'FIRST'::test_enum))

Two partitions, instead of 4, timestamp clause is still respected.
EXPLAIN ANALYZE SELECT * FROM test WHERE timestamp_col = '2019-01-01' and
enum_col IN ('FIRST', 'SECOND');

Append (cost=0.00..75.50 rows=2 width=16) (actual time=0.004..0.004 rows=0
loops=1)
-> Seq Scan on test_first_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.003..0.003 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))
-> Seq Scan on test_second_1 (cost=0.00..37.75 rows=1 width=16) (actual
time=0.001..0.001 rows=0 loops=1)
Filter: ((enum_col = ANY ('{FIRST,SECOND}'::test_enum[])) AND
(timestamp_col = '2019-01-01 00:00:00'::timestamp without time zone))

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2019-04-25 12:50:25 BUG #15780: Win10 PGS11 installer includes psAdmin3 LTS which does not support PGS11
Previous Message Amit Langote 2019-04-25 10:02:12 Re: BUG #15672: PostgreSQL 11.1/11.2 crashed after dropping a partition table