Re: Partial hash index is not used for implied qual.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Sergei Glukhov <s(dot)glukhov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Partial hash index is not used for implied qual.
Date: 2025-11-27 04:04:43
Message-ID: 36827.1764216283@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> On Thu, 27 Nov 2025 at 07:26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I checked the costing calculations and it's basically that
>> genericcostestimate doesn't understand about hash buckets.

> I assume you must mean using your "x % 100" case rather than Sergei's case.

Right.

>> ... tuples in the index, so it supposes that that means we'll visit
>> 1/100th of the index's pages (rounded up to 1).

> I'm not so clear on why this is bad. The index has 1000 tuples on 6
> pages and we want to scan 1% of the rows in the index. As a result,
> genericcostestimate() calculates that's 1 page. How many pages would
> you expect to scan?

To be clear, neither of genericcostestimate's estimates are bad as
a generic estimate. And I'm not even sure that we could do better
with a hash-aware estimate implemented in hashcostestimate. I think
the key thing about this test case is that the partial index ends
up with all its entries in one hash bucket. I'm not sure that we
could reasonably expect to know that in the cost estimator. Even
if we could, should we really expend a lot of effort on the case?
It's a textbook example of when you should not use a hash index.

I'm interested in fixing this can't-generate-this-plan-shape bug
because it probably impacts more-realistic use-cases. But I
think the cost estimation problem is probably specific to cases
where you shouldn't have used a hash index, so I'm okay with
ignoring that part. Until more evidence arrives, anyway.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-11-27 04:17:11 Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache
Previous Message Shlok Kyal 2025-11-27 03:55:23 Re: How can end users know the cause of LR slot sync delays?