BUG #15334: Partition elimination not working as expected when using enum as partition key

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: damir(dot)ciganovic(dot)jankovic(at)gmail(dot)com
Subject: BUG #15334: Partition elimination not working as expected when using enum as partition key
Date: 2018-08-16 14:05:17
Message-ID: 153442831773.1504.1930023229211807055@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15334
Logged by: Damir Ciganović-Janković
Email address: damir(dot)ciganovic(dot)jankovic(at)gmail(dot)com
PostgreSQL version: 10.2
Operating system: Linux (Centos), Windows
Description:

Our tables are partitioned by partition key RANGE (resolution, timestamp).

Timestamp is of type 'timestamp without time zone' and resolution is hour
own enum type which contains values 'HOUR' , 'DAY' and 'MONTH':
CREATE TYPE resolution AS ENUM (
'HOUR',
'DAY',
'MONTH'
);

Our table "my_report" has timestamp, resolution and count columns.

This is the query I am executing:
select "my_report"."count" from my_report where ("my_report"."resolution" =
'HOUR'::resolution and "my_report"."timestamp" >= timestamp '2018-08-16
07:00:00' and "my_report"."timestamp" < timestamp '2018-08-16 10:00:00.0')
limit 10;
(NOTE: I simplified the query so real query and execution plan are
different, but I think you will understand me)

I noticed that when doing this query via psql cmd I get this execution plan

+++++
...
-> Append (cost=0.00..159.76 rows=1 width=24) (actual time=0.585..0.585
rows=0 loops=1)
-> Seq Scan on my_report_hour_20180816 (cost=0.00..159.76 rows=1
width=24) (actual time=0.584..0.584 rows=0 loops=1)
Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = 'HOUR'::resolution))
Rows Removed by Filter: 4672
...
++++++

This is the correct behaviour, we picked the partition where our timestamps
are in range, and also resolution is HOUR.
Now, when executing this same query via our client (Java), but passing the
resolution as a parameter (and not hardcoding HOUR value in my query like
first time). I would expect that execution plan will be the same, but that
is not the case:

++++++
...
-> Append (cost=0.00..372.24 rows=3 width=24) (actual time=1.117..1.117
rows=0 loops=1)
-> Seq Scan on my_report_hour_20180816 (cost=0.00..183.12 rows=1
width=24) (actual time=0.589..0.589 rows=0 loops=1)
Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = ('HOUR'::cstring)::resolution))
Rows Removed by Filter: 4672
-> Seq Scan on my_report_day_201808 (cost=0.00..94.56 rows=1 width=24)
(actual time=0.265..0.265 rows=0 loops=1)
Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = ('HOUR'::cstring)::resolution))
Rows Removed by Filter: 2336
-> Seq Scan on my_report_month_201808 (cost=0.00..94.56 rows=1
width=24) (actual time=0.261..0.261 rows=0 loops=1)
Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = ('HOUR'::cstring)::resolution))
Rows Removed by Filter: 2336
...
++++++

As we can see here, our execution will do seq scan on all my_report
partitions for that period (resolutions: day, month and hour) even though we
put HOUR as our query parameter. One thing that I observed is that insead of
(resolution = 'HOUR'::resolution)) like we got in the first example, filters
look like this (resolution = ('HOUR'::cstring)::resolution)).

It seems to me that PostgreSQL is not removing the (::cstring) part before
partition elimination so that it is forced to go through all of them
bassically ignoring the resolution value.

I have a workaround now by putting exact value in the query (I wrote exact
string `"my_report"."resolution" = 'DAY'::resolution`) into my query, and
not passing the value as parameter, but this looks to me as a bug.

Thanks in advance

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2018-08-16 15:25:45 Re: BUG #15333: pg_dump error on large table -- "pg_dump: could not stat file...Unknown error"
Previous Message PG Bug reporting form 2018-08-16 12:51:54 BUG #15333: pg_dump error on large table -- "pg_dump: could not stat file...Unknown error"