| From: | Sergei Glukhov <s(dot)glukhov(at)postgrespro(dot)ru> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Partial hash index is not used for implied qual. |
| Date: | 2025-11-25 11:50:59 |
| Message-ID: | cb0b9982-ca1a-4f03-9d7f-f9ca678be35c@postgrespro.ru |
| Lists: | pgsql-hackers |
Hi,
On 11/25/25 6:01 AM, Tom Lane wrote:
> I wrote:
>> Wouldn't it be better to handle it more like the is_target_rel logic
>> a few lines further up?
> Actually, after thinking a bit longer, it'd be better to do something
> like the attached so that we don't keep redundant quals unless they'd
> *all* be excluded.
>
> There's definitely something fishy about the costing though.
> I experimented with this variant of Sergei's example:
>
> regression=# CREATE TABLE hash_partial(x) AS SELECT x % 100 as y from generate_series(1, 1000) as x;
> SELECT 1000
> regression=# ANALYZE hash_partial;
> ANALYZE
> regression=# CREATE INDEX partial_idx ON hash_partial USING hash(x) WHERE x = 1;
> CREATE INDEX
> regression=# set enable_seqscan TO 0; -- else we'll go for a seqscan
> SET
> regression=# EXPLAIN SELECT x FROM hash_partial WHERE x = 1;
> QUERY PLAN
> ----------------------------------------------------------------------------
> Bitmap Heap Scan on hash_partial (cost=24.08..32.56 rows=10 width=4)
> Recheck Cond: (x = 1)
> -> Bitmap Index Scan on partial_idx (cost=0.00..24.07 rows=10 width=0)
> Index Cond: (x = 1)
> (4 rows)
>
> regression=# drop index partial_idx;
> DROP INDEX
> regression=# CREATE INDEX ON hash_partial USING hash(x);
> CREATE INDEX
> regression=# EXPLAIN SELECT x FROM hash_partial WHERE x = 1;
> QUERY PLAN
> ----------------------------------------------------------------------------------
> Bitmap Heap Scan on hash_partial (cost=4.08..12.56 rows=10 width=4)
> Recheck Cond: (x = 1)
> -> Bitmap Index Scan on hash_partial_x_idx (cost=0.00..4.08 rows=10 width=0)
> Index Cond: (x = 1)
> (4 rows)
>
> Why are we thinking that a non-partial index would be substantially
> cheaper to scan? That seems surely wrong, and it runs counter to my
> intuition about why this fix is incomplete. (I expected an unfair
> bias towards the partial index, not against it.)
>
> regards, tom lane
>
Thanks for the fix. It seems there is another case for investigation:
DROP TABLE hash_partial;
CREATE TABLE hash_partial(x, y) AS
SELECT x, x + x as y from generate_series(1, 1000) as x;
ANALYZE hash_partial;
CREATE INDEX partial_idx ON hash_partial USING hash(x) WHERE x = 1;
SET enable_seqscan TO 0;
EXPLAIN SELECT x FROM hash_partial WHERE x = 1 and y < 0;
                                   QUERY PLAN
--------------------------------------------------------------------------------
Seq Scan on hash_partial (cost=0.00..23.00 rows=1 width=4)
Disabled: true
Filter: ((y < 0) AND (x = 1))
(3 rows)
Regarding the strangeness of the cost: the cost depends on numIndexPages,
which genericcostestimate() calculates as

numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

For the non-partial index, index->pages = 6 and index->tuples = 1000,
while for the partial index, index->pages = 6 and index->tuples = 10.
The number of pages comes from the size of the index relation, and the
initial size is 6 * BLCKSZ for both the partial and the non-partial hash
index, because the initial size of a hash index relation is driven by the
total number of tuples in the table, not by how many tuples satisfy the
index predicate.
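If I am reading genericcostestimate() right, the index page component is
charged at random_page_cost per page, so with numIndexTuples = 10 in both
plans the arithmetic works out roughly like this (just a sketch, assuming
the default random_page_cost = 4.0 and ignoring the small CPU terms):

/* non-partial index: index->pages = 6, index->tuples = 1000 */
numIndexPages = ceil(10 * 6.0 / 1000);   /* = 1 page,  ~1 * 4.0 = 4  */

/* partial index: index->pages = 6, index->tuples = 10 */
numIndexPages = ceil(10 * 6.0 / 10);     /* = 6 pages, ~6 * 4.0 = 24 */

That matches the 4.08 vs. 24.07 index scan costs above: the partial index
is charged for reading all six of its initially allocated pages even though
it holds only ten tuples.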
Regards,
Sergei Glukhov