Re: BUG #18973: The default enable_material=ON affects the cost estimation of optimizer, resulting in 10968x slow

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jinhui-lai(at)foxmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18973: The default enable_material=ON affects the cost estimation of optimizer, resulting in 10968x slow
Date: 2025-06-30 14:56:58
Message-ID: 2826336.1751295418@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> I may have found a performance issue. The parameter enable_material is set
> to ON by default, and it affects the cost estimation of optimizer, resulting
> in 10968x slow. You can reproduce it as follows:
> CREATE TABLE t0(c0 INT8);
> CREATE TABLE t1(c1 INT8);
> INSERT INTO t1 SELECT i FROM generate_series(1, 100000000) AS i;
> SELECT * FROM t0 INNER JOIN t1 ON t0.c0 != t1.c1;
> c0 | c1
> ----+----
> (0 rows)
> Time: 9794.016 ms (00:09.794)

The problem with this example is that you didn't ANALYZE the tables.
If you do, it switches to a plan without Materialize:

regression=# CREATE TABLE t0(c0 INT8);
CREATE TABLE
regression=# CREATE TABLE t1(c1 INT8);
CREATE TABLE
regression=# INSERT INTO t1 SELECT i FROM generate_series(1, 100000000) AS i;
INSERT 0 100000000
regression=# explain analyze SELECT * FROM t0 INNER JOIN t1 ON t0.c0 != t1.c1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..3391443465.73 rows=224870062964 width=16) (actual time=19992.481..19992.483 rows=0.00 loops=1)
Join Filter: (t0.c0 <> t1.c1)
Buffers: shared read=442478 dirtied=442478 written=428541
-> Seq Scan on t1 (cost=0.00..1442478.28 rows=100000028 width=8) (actual time=0.136..11957.262 rows=100000000.00 loops=1)
Buffers: shared read=442478 dirtied=442478 written=428541
-> Materialize (cost=0.00..43.90 rows=2260 width=8) (actual time=0.000..0.000 rows=0.00 loops=100000000)
Storage: Memory Maximum Storage: 17kB
-> Seq Scan on t0 (cost=0.00..32.60 rows=2260 width=8) (actual time=0.005..0.005 rows=0.00 loops=1)
Planning:
Buffers: shared hit=68 read=33
Planning Time: 4.135 ms
Execution Time: 19992.525 ms
(12 rows)

regression=# vacuum analyze t0,t1;
VACUUM
regression=# explain analyze SELECT * FROM t0 INNER JOIN t1 ON t0.c0 != t1.c1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2692478.72 rows=100000031 width=16) (actual time=0.004..0.005 rows=0.00 loops=1)
Join Filter: (t0.c0 <> t1.c1)
-> Seq Scan on t0 (cost=0.00..0.00 rows=1 width=8) (actual time=0.004..0.004 rows=0.00 loops=1)
-> Seq Scan on t1 (cost=0.00..1442478.32 rows=100000032 width=8) (never executed)
Planning:
Buffers: shared hit=9
Planning Time: 0.094 ms
Execution Time: 0.017 ms
(8 rows)

But really that's kind of cheating, because it depends critically
on t0 being completely empty. If we add a row there so that the
join has to do some work, there is not so much value after all:

regression=# insert into t0 values(1);
INSERT 0 1
regression=# vacuum analyze t0;
VACUUM
regression=# explain analyze SELECT * FROM t0 INNER JOIN t1 ON t0.c0 != t1.c1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2692479.73 rows=100000031 width=16) (actual time=0.051..11894.867 rows=99999999.00 loops=1)
Join Filter: (t0.c0 <> t1.c1)
Rows Removed by Join Filter: 1
Buffers: shared hit=15701 read=426778
-> Seq Scan on t0 (cost=0.00..1.01 rows=1 width=8) (actual time=0.003..0.005 rows=1.00 loops=1)
Buffers: shared hit=1
-> Seq Scan on t1 (cost=0.00..1442478.32 rows=100000032 width=8) (actual time=0.044..3853.565 rows=100000000.00 loops=1)
Buffers: shared hit=15700 read=426778
Planning:
Buffers: shared hit=6
Planning Time: 0.068 ms
Execution Time: 14050.387 ms
(12 rows)

We don't optimize for the case of tables being completely
empty, because that's basically a zero-probability situation
in real-world queries. So even though this don't-scan-the-
inner-table-when-the-outer-one-is-empty short-circuit exists
in the executor, the optimizer does not plan on the assumption
of that happening. That's not a bug, it's intentional.
We judge that a plan made on that assumption will be too
brittle if the table turns out to not be empty after all.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2025-06-30 15:14:43 Re: BUG #18970: Atempt to alter type of table column used in row type with check leads to assertion failure
Previous Message PG Bug reporting form 2025-06-30 13:44:19 BUG #18973: The default enable_material=ON affects the cost estimation of optimizer, resulting in 10968x slow