Re: heapgettup refactoring

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: heapgettup refactoring
Date: 2023-02-07 20:41:27
Message-ID: 20230207204127.7vs6krqjqn5farr7@liskov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote:
> On Fri, 3 Feb 2023 at 15:26, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > I've pushed all but the final 2 patches now.
>
> I just pushed the final patch in the series.

Cool!

> I held back on moving the setting of rs_inited back into the
> heapgettup_initial_block() helper function as I wondered if we should
> even keep that field.
>
> It seems that rs_cblock == InvalidBlockNumber in all cases where
> rs_inited == false, so maybe it's better just to use that as a
> condition to check if the scan has started or not. I've attached a
> patch which does that.
>
> I ended up adjusting HeapScanDescData more than what is minimally
> required to remove rs_inited. I wondered if rs_cindex should be closer
> to rs_cblock in the struct so that we're more likely to be adjusting
> the same cache line than if that field were closer to the end of the
> struct. We don't need rs_coffset and rs_cindex at the same time, so I
> made it a union. I see that the struct is still 712 bytes before and
> after this change. I've not yet tested to see if there are any
> performance gains to this change.
>

I like the idea of using a union.

> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 7eb79cee58..e171d6e38b 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -321,13 +321,15 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
> }
>
> scan->rs_numblocks = InvalidBlockNumber;
> - scan->rs_inited = false;
> scan->rs_ctup.t_data = NULL;
> ItemPointerSetInvalid(&scan->rs_ctup.t_self);
> scan->rs_cbuf = InvalidBuffer;
> scan->rs_cblock = InvalidBlockNumber;
>
> - /* page-at-a-time fields are always invalid when not rs_inited */
> + /*
> + * page-at-a-time fields are always invalid when
> + * rs_cblock == InvalidBlockNumber
> + */

So, I was wondering what we should do about initializing rs_coffset here
since it doesn't fall under "don't initialize it because it is only used
for page-at-a-time mode". It might not be required for us to initialize
it in initscan, but we do bother to initialize other "current scan
state" fields. We could check if pagemode is enabled and initialize
rs_coffset or rs_cindex depending on that.

Then maybe the comment should call out the specific page-at-a-time
fields that are automatically invalid? (e.g. rs_ntuples, rs_vistuples)

I presume the point of the comment is to explain why those fields are
not being initialized here, which was a question I had when I looked at
initscan(), so it seems like we should make sure it explains that.

> @@ -717,9 +720,9 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber block, ScanDirection dir
> * the scankeys.
> *
> * Note: when we fall off the end of the scan in either direction, we
> - * reset rs_inited. This means that a further request with the same
> - * scan direction will restart the scan, which is a bit odd, but a
> - * request with the opposite scan direction will start a fresh scan
> + * reset rs_cblock to InvalidBlockNumber. This means that a further request
> + * with the same scan direction will restart the scan, which is a bit odd, but
> + * a request with the opposite scan direction will start a fresh scan
> * in the proper direction. The latter is required behavior for cursors,
> * while the former case is generally undefined behavior in Postgres
> * so we don't care too much.

Not the fault of this patch, but I am having trouble parsing this
comment. What does restart the scan mean? I get that it is undefined
behavior, but it is confusing because it kind of sounds like a rescan
which is not what it is (right?). And what exactly is a fresh scan? It
is probably correct, I just don't really understand what it is saying.
Feel free to ignore this aside, as I think your change is correctly
updating the comment.

> @@ -2321,13 +2316,12 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
> ReleaseBuffer(hscan->rs_cbuf);
> hscan->rs_cbuf = InvalidBuffer;
> hscan->rs_cblock = InvalidBlockNumber;
> - hscan->rs_inited = false;
>
> return false;
> }
>
> heapgetpage(scan, blockno);
> - hscan->rs_inited = true;
> + Assert(hscan->rs_cblock != InvalidBlockNumber);

Quite nice to have this assert.

> diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
> index 8d28bc93ef..c6efd59eb5 100644
> --- a/src/include/access/heapam.h
> +++ b/src/include/access/heapam.h
> @@ -56,9 +56,18 @@ typedef struct HeapScanDescData
> /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */
>
> /* scan current state */
> - bool rs_inited; /* false = scan not init'd yet */
> - OffsetNumber rs_coffset; /* current offset # in non-page-at-a-time mode */
> - BlockNumber rs_cblock; /* current block # in scan, if any */
> + union
> + {
> + /* current offset in non-page-at-a-time mode */
> + OffsetNumber rs_coffset;
> +
> + /* current tuple's index in vistuples for page-at-a-time mode */
> + int rs_cindex;
> + };

With the union up here, the comment about page-at-a-time mode members
near the bottom of the struct is now a bit misleading.

The rest of my thoughts about that are with that comment.

> +
> + BlockNumber rs_cblock; /* current block # in scan, or
> + * InvalidBlockNumber when the scan is not yet
> + * initialized */

The formatting of this comment is a bit difficult to read. Perhaps it
could go above the member?

Also, a few random other complaints about the comments in
HeapScanDescData that are also not the fault of this patch:

- why is this comment there twice?
/* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" */

- above the union it said (prior to this patch also)
/* scan current state */
which is in contrast to
/* state set up at initscan time */
from the top, but, arguably, rs_strategy is set up at initscan time

- who or what is NB?
/* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */

> Buffer rs_cbuf; /* current buffer in scan, if any */
> /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
>
> @@ -74,7 +83,6 @@ typedef struct HeapScanDescData
> ParallelBlockTableScanWorkerData *rs_parallelworkerdata;
>
> /* these fields only used in page-at-a-time mode and for bitmap scans */
> - int rs_cindex; /* current tuple's index in vistuples */
> int rs_ntuples; /* number of visible tuples on page */
> OffsetNumber rs_vistuples[MaxHeapTuplesPerPage]; /* their offsets */

I would personally be okay with either a block comment somewhere nearby
describing which members are used in page-at-a-time mode or individual
comments for each field that is used only in page-at-a-time mode (or
something else, I just think the situation with the patch applied
currently is confusing).

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-02-07 20:44:02 Re: PATCH: Using BRIN indexes for sorted output
Previous Message Andres Freund 2023-02-07 20:28:34 Re: Data is copied twice when specifying both child and parent table in publication