Re: Tid scan improvements

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Edmund Horner <ejrh00(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tid scan improvements
Date: 2021-02-04 10:51:39
Message-ID: CAApHDvrdoop-9ojBMLX0x_xa0=6v_trhXM8soAwSb5yebaWPgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 4 Feb 2021 at 10:31, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> Thanks for looking at this.
>
> On Thu, 4 Feb 2021 at 10:19, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Perhaps something like
> >
> > typedef struct TableScanTidRange TableScanTidRange;
> >
> > TableScanTidRange* table_scan_tid_range_start(TableScanDesc sscan, ItemPointer mintid, ItemPointer maxtid);
> > bool table_scan_tid_range_nextslot(TableScanDesc sscan, TableScanTidRange *range, TupleTableSlot *slot);
> > void table_scan_tid_range_end(TableScanDesc sscan, TableScanTidRange* range);
> >
> > would work better? That'd allow an AM to have arbitrarily large state
> > for a tid range scan, would make it clear what the lifetime of the
> > ItemPointer mintid, ItemPointer maxtid are etc. Wouldn't, on the API
> > level, prevent multiple tid range scans from being in progress at the
> > same times though :(. Perhaps we could add a TableScanTidRange* pointer
> > to TableScanDesc which'd be checked/set by tableam.h which'd prevent that?
>
> Maybe the TableScanTidRange can just have a field to store the
> TableScanDesc. That way table_scan_tid_range_nextslot and
> table_scan_tid_range_end can just pass the TableScanTidRange pointer.
>
> That way it seems like it would be ok for multiple scans to be going
> on concurrently as nobody should be reusing the TableScanDesc.

I ended up adding just two new API functions to table AM.

void (*scan_set_tid_range) (TableScanDesc sscan,
ItemPointer mintid,
ItemPointer maxtid);

and
bool (*scan_tid_range_nextslot) (TableScanDesc sscan,
ScanDirection direction,
TupleTableSlot *slot);

I added an additional function in tableam.h that does not have a
corresponding API function:

static inline TableScanDesc
table_tid_range_start(Relation rel, Snapshot snapshot,
ItemPointer mintid,
ItemPointer maxtid)

This just calls the standard scan_begin then calls scan_set_tid_range
setting the specified mintid and maxtid.

I also added 2 new fields to TableScanDesc:

ItemPointerData rs_mintid;
ItemPointerData rs_maxtid;

I didn't quite see a need to have a new start and end scan API function.

Updated patch attached.

David

Attachment Content-Type Size
v13-0001-Add-TID-Range-Scans-to-support-efficient-scannin.patch text/plain 72.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2021-02-04 11:00:05 Re: Removing support for COPY FROM STDIN in protocol version 2
Previous Message Amit Kapila 2021-02-04 10:46:18 Re: Parallel INSERT (INTO ... SELECT ...)