Re: Tid scan improvements

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Edmund Horner <ejrh00(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tid scan improvements
Date: 2021-02-19 07:37:07
Message-ID: CAApHDvqoO33Em0tWoJvwN2cqvAJ45nB+qqfS5cxLbQ45BGPwkg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 18 Feb 2021 at 09:45, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Wed, 17 Feb 2021 at 11:05, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Architecturally it feels like this is something that really belongs more
> > into plan time?
>
> Possibly. It would mean TidOpExpr would have to become a Node type.
> TID Range scan is really just following the lead set by TID Scan here.
>
> I'm not sure if it's worth the trouble making these Node types for the
> small gains there'd be in the performance of having the planner make
> them once for prepared queries rather than them having to be built
> during each execution.

I changed the code around and added a new Node type to the planner and
made it create the TidRangeExpr during planning.

However, I'm pretty much set on this being pretty horrible and I ended
up ripping it back out again. The reason is that there's quite a bit
of extra boilerplate code that goes with the new node type. e.g it
must be handled in setrefs.c. EXPLAIN also needs to know about the
new Node. That either means teaching the deparse code about
TidRangeExprs or having the Plan node carry along the OpExprs just so
we can make EXPLAIN work. Translating between the two might be
possible but it just seemed too much code and I started feeling pretty
bad about the whole idea.

> > > +/*
> > > + * table_set_tidrange resets the minimum and maximum TID range to scan for a
> > > + * TableScanDesc created by table_beginscan_tidrange.
> > > + */
> > > +static inline void
> > > +table_set_tidrange(TableScanDesc sscan, ItemPointer mintid,
> > > + ItemPointer maxtid)
> > > +{
> > > + /* Ensure table_beginscan_tidrange() was used. */
> > > + Assert((sscan->rs_flags & SO_TYPE_TIDRANGESCAN) != 0);
> > > +
> > > + sscan->rs_rd->rd_tableam->scan_set_tidrange(sscan, mintid, maxtid);
> > > +}
> >
> > How does this interact with rescans?
>
> We must call table_rescan() before calling table_set_tidrange() again.
> That perhaps could be documented better. I'm just unsure if that
> should be documented in tableam.h or if it's a restriction that only
> needs to exist in heapam.c

I've changed things around so that we no longer explicitly call
table_rescan() in nodeTidrangescan.c. Instead table_set_tidrange()
does a rescan call. I also adjusted the documentation to mention that
changing the tid range starts the scan again. This does mean we'll do
a ->scan_rescan() the first time we do table_set_tidrange(). I'm not
all that sure that matters.

v15 attached.

David

Attachment Content-Type Size
v15-0001-Add-TID-Range-Scans-to-support-efficient-scannin.patch text/plain 72.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2021-02-19 07:39:46 Re: Problem with accessing TOAST data in stored procedures
Previous Message Michael Paquier 2021-02-19 07:34:41 Re: pg_config_h.in not up-to-date