Re: Use of additional index columns in rows filtering

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-08-09 17:00:34
Message-ID: 2a8089ec-9e3a-14a5-024c-a2784cfe367f@enterprisedb.com
Lists: pgsql-hackers

On 8/8/23 22:54, Peter Geoghegan wrote:
> On Tue, Aug 8, 2023 at 1:24 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>> Assuming that that happens, then it immediately gives index scans a
>>> huge advantage over bitmap index scans. At that point it seems
>>> important to describe (in high level terms) where it is that the
>>> advantage is innate, and where it's just because we haven't done the
>>> required work for bitmap index scans. I became confused on this point
>>> myself yesterday. Admittedly I should have been able to figure it out
>>> on my own -- but it is confusing.
>>>
>>
>> Yeah, I agree that might help a lot, particularly for tables that have a
>> significant fraction of not-all-visible pages.
>
> It also has the potential to make the costing a lot easier in certain
> important cases. Accurately deriving just how many heap accesses can
> be avoided via the VM from the statistics that are available to the
> planner is likely always going to be very difficult. Finding a way to
> make that just not matter at all (in these important cases) can also
> make it safe to bias the costing, such that the planner tends to favor
> index scans (and index-only scans) over bitmap index scans that cannot
> possibly eliminate any heap page accesses via an index filter qual.
>

Yes, if there's a way to safely skip the visibility check for some
conditions, that would probably make the costing simpler.

Anyway, I find this discussion rather abstract and I'll probably forget
half the important cases by next week. Maybe it'd be good to build a set
of examples demonstrating the interesting cases? We've already used a
couple tenk1 queries for that purpose ...

>> Right, and I'm not against improving that, but I see it more like an
>> independent task. I don't think it needs to (or should) be part of this
>> patch - skipping visibility checks would apply to IOS, while this is
>> aimed only at plain index scans.
>
> I'm certainly not going to insist on it. Worth considering if putting
> it in scope could make certain aspects of this patch (like the
> costing) easier, though.
>
> I think that it wouldn't be terribly difficult to make simple
> inequalities into true index quals. I think I'd like to have a go at
> it myself. To some degree I'm trying to get a sense of how much that'd
> help you.
>

I'm trying to make the patch not depend on such a change. In a way,
once a clause gets recognized as index qual, it becomes irrelevant for
my patch. But the patch also doesn't get any simpler, because it still
needs to do the same thing for the remaining quals.

OTOH if there was some facility to decide if a qual is "safe" to be
executed on the index tuple, that'd be nice. But as I already said, I
see it more as an additional optimization, as it only applies to a
subset of cases.
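
To make the distinction less abstract, here is a toy model in Python (not
PostgreSQL code) of a plain index scan that may evaluate a filter on an
additional index column before fetching the heap tuple. Everything in it
is an assumption made for illustration: the IndexEntry layout, the scan()
function, and in particular the rule that the filter runs on the index
tuple only when the heap page is all-visible. Tuple-level visibility is
ignored, so the only thing it shows is how many heap fetches the early
filter can avoid.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    key: int        # leading index column, matched by the index qual
    extra: int      # additional index column, usable only as a filter
    heap_page: int  # heap page this entry points to


def scan(entries, all_visible_pages, key, extra_pred, filter_on_index):
    """Toy index scan. Returns (matching entries, number of heap fetches).

    Assumption: the filter may run on the index tuple only when the heap
    page is marked all-visible; otherwise the heap tuple is fetched first.
    """
    matches = []
    heap_fetches = 0
    for e in entries:
        if e.key != key:
            # Index qual: in a real scan this decides which entries are
            # visited at all; here we simply skip non-matching entries.
            continue
        if filter_on_index and e.heap_page in all_visible_pages:
            # Filter evaluated on the index tuple; a rejected entry
            # never causes a heap access.
            if not extra_pred(e.extra):
                continue
        # Plain index scan: the heap tuple is still fetched for every
        # entry that was not rejected above.
        heap_fetches += 1
        if extra_pred(e.extra):
            matches.append(e)
    return matches, heap_fetches


# Hypothetical data: 8 entries with key = 1 spread over 4 heap pages,
# of which pages 0 and 1 are all-visible.
entries = [IndexEntry(1, x, x % 4) for x in range(8)] + [IndexEntry(2, 0, 0)]
all_visible = {0, 1}


def is_even(v):
    return v % 2 == 0


baseline = scan(entries, all_visible, 1, is_even, filter_on_index=False)
filtered = scan(entries, all_visible, 1, is_even, filter_on_index=True)
print(len(baseline[0]), baseline[1])
print(len(filtered[0]), filtered[1])
```

Both calls return the same rows; only the number of heap fetches differs,
and the saving comes entirely from the all-visible pages. A facility that
marks a qual as "safe" would, in this model, amount to dropping the
all-visible test for that qual.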

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
