Re: BUG #16251: ::text type casting of a constant breaks query performance

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: imyfess(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16251: ::text type casting of a constant breaks query performance
Date: 2020-02-08 07:09:11
Message-ID: CAFj8pRAADe1n7GhBXZevEf2kTZ0r=RfZBA5-No0zcQhjOY88jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

so 8. 2. 2020 v 7:44 odesílatel PG Bug reporting form <
noreply(at)postgresql(dot)org> napsal:

> The following bug has been logged on the website:
>
> Bug reference: 16251
> Logged by: Dima Pavlov
> Email address: imyfess(at)gmail(dot)com
> PostgreSQL version: 12.1
> Operating system: Windows 10
> Description:
>
> Test table and indexes (PostgreSQL 12.1):
>
> CREATE TABLE t (dt timestamp with time zone);
> CREATE INDEX ind ON t USING btree (dt);
>
> INSERT
> INTO t(dt)
> SELECT
> (
> timestamp '2020-01-01 00:00:00' +
> random() * (
> timestamp '2020-02-29 00:00:00' -
> timestamp '2020-01-01 00:00:00'
> )
> )
> FROM generate_series(1, 10000)
>
> -------------------------------------
>
> In the first query, everything is ok, appropriate index "ind" is used:
>
> explain (analyze, buffers)
> SELECT *
> FROM t
> WHERE
> ('2020-02-08')::date IS NULL
> OR
> dt > '2020-02-08'
> ORDER BY dt
> LIMIT 1
>
> "Limit (cost=0.29..0.37 rows=1 width=8) (actual time=0.186..0.188 rows=1
> loops=1)"
> " Buffers: shared hit=3"
> " -> Index Only Scan using ind on t (cost=0.29..303.75 rows=3627
> width=8)
> (actual time=0.184..0.184 rows=1 loops=1)"
> " Index Cond: (dt > '2020-02-08 00:00:00+05'::timestamp with time
> zone)"
> " Heap Fetches: 1"
> " Buffers: shared hit=3"
> "Planning Time: 2.365 ms"
> "Execution Time: 0.239 ms"
>
> -----------------------------------------------
>
> With '::text' type casting of '2020-02-08' (which is already text) query
> permofance is very low
>
> explain (analyze, buffers)
> SELECT *
> FROM t
> WHERE
> ('2020-02-08'::text)::date IS NULL
> OR
> dt > '2020-02-08'
> ORDER BY dt
> LIMIT 1
>
> "Limit (cost=0.29..0.44 rows=1 width=8) (actual time=45.306..45.307 rows=1
> loops=1)"
> " Buffers: shared hit=6232"
> " -> Index Only Scan using ind on t (cost=0.29..561.28 rows=3658
> width=8)
> (actual time=45.304..45.304 rows=1 loops=1)"
> " Filter: ((('2020-02-08'::cstring)::date IS NULL) OR (dt >
> '2020-02-08 00:00:00+05'::timestamp with time zone))"
> " Rows Removed by Filter: 6367"
> " Heap Fetches: 6368"
> " Buffers: shared hit=6232"
> "Planning Time: 0.348 ms"
> "Execution Time: 45.343 ms"
>

This is not a bug, but just feature.

The '2020-02-08' is not text type - it is 'unknown' type - and then is just
directly transformed to date. I think so cast from text to date is not
maybe immutable, and it can stops some optimizations.

postgres=# explain analyze select * from foo where '2020-02-01'::date is
null or a > '2020-01-01';
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Index Only Scan using foo_a_idx on foo (cost=0.42..9.12 rows=40 width=8)
(actual time=0.083..0.125 rows=37 loops=1) │
│ Index Cond: (a > '2020-01-01 00:00:00+01'::timestamp with time zone)

│ Heap Fetches: 37

│ Planning Time: 0.279 ms

│ Execution Time: 0.189 ms

└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

postgres=# explain analyze select * from foo where '2020-02-01'::text::date
is null or a > '2020-01-01';
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Seq Scan on foo (cost=0.00..2193.00 rows=540 width=8) (actual
time=0.050..71.671 rows=37 loops=1) │
│ Filter: ((('2020-02-01'::cstring)::date IS NULL) OR (a > '2020-01-01
00:00:00+01'::timestamp with time zone)) │
│ Rows Removed by Filter: 99963

│ Planning Time: 0.299 ms

│ Execution Time: 71.714 ms

└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

The problem is in casting from Date to Timestamptz - the related function
"timestamptz" is marked as "stable" - and then probably the all expression
is marked as "stable", what block to use this expression as index condition.

I don't know the context, but the expression "'2020-02-01'::date is null or
a > '2020-01-01'" has not too much sense, so just don't do this. Postgres
is not too smart and don't try to reduce some useless part of expressions.

Or if you need it, then use UNION and separate this expression to two
independent expressions

postgres=# explain analyze select * from foo where '2020-02-01'::text::date
is null union select * from foo where a > '2020-01-01';
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ HashAggregate (cost=3202.82..4203.22 rows=100040 width=8) (actual
time=0.175..1.635 rows=37 loops=1) │
│ Group Key: foo.a

│ -> Append (cost=0.01..2952.72 rows=100040 width=8) (actual
time=0.061..0.120 rows=37 loops=1) │
│ -> Result (cost=0.01..1443.01 rows=100000 width=8) (actual
time=0.016..0.017 rows=0 loops=1) │
│ One-Time Filter: (('2020-02-01'::cstring)::date IS NULL)

│ -> Seq Scan on foo (cost=0.01..1443.01 rows=100000
width=8) (never executed) │
│ -> Index Only Scan using foo_a_idx on foo foo_1
(cost=0.42..9.12 rows=40 width=8) (actual time=0.043..0.090 rows=37
loops=1) │
│ Index Cond: (a > '2020-01-01 00:00:00+01'::timestamp with
time zone) │
│ Heap Fetches: 37

│ Planning Time: 0.437 ms

│ Execution Time: 6.690 ms

└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(11 rows)

Regards

Pavel

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Pavel Stehule 2020-02-08 07:28:07 Re: BUG #16251: ::text type casting of a constant breaks query performance
Previous Message PG Bug reporting form 2020-02-08 06:43:22 BUG #16251: ::text type casting of a constant breaks query performance