Re: Tid scan improvements

From: Edmund Horner <ejrh00(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tid scan improvements
Date: 2018-09-19 06:04:43
Message-ID: CAMyN-kCCjuLR=guOHBbD66BE4EuiDe8R5mDrmBJOySntdUS6Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 17 Sep 2018 at 23:21, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 15 August 2018 at 11:11, Edmund Horner <ejrh00(at)gmail(dot)com> wrote:
>> So we'd extend that to:
>> - Include in the OR-list "range" subquals of the form (ctid > ? AND
>> ctid < ?) (either side could be optional, and we have to deal with >=
>> and <= and having ctid on the rhs, etc.).
>> - Cost the range subquals by assuming they don't overlap, and
>> estimating how many blocks and tuples they span.
>> - When beginning the scan, evaluate all the ?s and build an array of
>> "tid ranges" to fetch. A tid range is a struct with a starting tid,
>> and an ending tid, and might just be a single tid item.
>> - Sort and remove duplicates.
>> - Iterate over the array, using a single fetch for single-item tid
>> ranges, and starting/ending a heap scan for multi-item tid ranges.
>>
>> I think I'll try implementing this.
>
>
> I've set this patch as waiting on author in the commitfest app.

Thanks David.

Between work I have found time here and there to work on it, but
making a path type that handles all the above turns out to be
surprisingly harder than my tid range scan.

In the earlier discussion from 2012, Tom Lane said:

> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Wed, Jun 13, 2012 at 03:21:17PM -0500, Merlin Moncure wrote:
> >> IMNSHO, it's a no-brainer for the todo (but I think it's more
> >> complicated than adding some comparisons -- which are working now):
>
> > I see. Seems we have to add index smarts to those comparisons. That
> > might be complicated.
>
> Uh, the whole point of a TID scan is to *not* need an index.
>
> What would be needed is for tidpath.c to let through more kinds of TID
> comparison quals than it does now, and then for nodeTidscan.c to know
> what to do with them. The latter logic might well look something like
> btree indexscan qual preparation, but it wouldn't be the same code.

I have been generally following this approach (handling more kinds of
TID comparisons), and have found myself doing things like pairing up >
with <, estimating how much of a table is covered by some set of >, <,
or "> AND <" quals, etc. Things that I'm sure are handled in an
advanced way by index paths; unfortunately I didn't see any easily
reusable code in the index path code. So I've ended up writing
special-case code for TID scans. Hopefully it will be worth it.

Edmund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-09-19 06:13:03 Re: Changing the setting of wal_sender_timeout per standby
Previous Message Michael Paquier 2018-09-19 05:40:31 Re: Changing the setting of wal_sender_timeout per standby