Re: heapgettup refactoring

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: heapgettup refactoring
Date: 2023-01-30 23:18:45
Message-ID: CAAKRu_Z-PL2pKuNv44wxUia4RshRwD6+Mrzf0m6CedvmyaCTLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

v7 attached

On Fri, Jan 27, 2023 at 10:34 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> "On Wed, 25 Jan 2023 at 10:17, Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > I've added a comment but I didn't include the function name in it -- I
> > find it repetitive when the comments above functions do that -- however,
> > I'm not strongly attached to that.
>
> I think the general format for header comments is:
>
> /*
> * <function name>
> *\t\t<brief summary of what function does>
> *
> * [Further details]
> */
>
> We've certainly got places that don't follow that, but I don't think
> that's any reason to have no comment or invent some new format.
>
> heapam.c seems to have some other format where we do: "<function name>
> - <brief summary of what function does>". I generally just try to copy
> the style from the surrounding code. I think generally, people won't
> argue if you follow the style from the surrounding code, but there'd
> be exceptions to that, I'm sure.

I have followed the same convention as the other functions in heapam.c
in the various helper functions comments I've added in this version.

> v6-0002:
>
> 1. heapgettup_initial_block() needs a header comment to mention what
> it does and what it returns. It would be good to make it obvious that
> it returns InvalidBlockNumber when there are no blocks to scan.

I've done this.

>
> 2. After heapgettup_initial_block(), you're checking "if (block ==
> InvalidBlockNumber). It might be worth a mention something like
>
> /*
> * Check if we got to the end of the scan already. This could happen for
> * an empty relation or if parallel workers scanned everything before we
> * got a chance.
> */
>
> the backward scan comment wouldn't mention parallel workers.

I've done this as well.

>
> v6-0003:
>
> 3. Can you explain why you removed the snapshot local variable in heapgettup()?

In the subsequent commit, the helpers I add call TestForOldSnapshot(),
and I didn't want to pass in the snapshot as a separate parameter since
I already need to pass the scan descriptor. I thought it was confusing
to have a local variable (snapshot) used in some places and the one in
the scan used in others. This "streamlining" commit also reduces the
number of times the snapshot variable is used, making it less necessary
to have a local variable.

I didn't remove the snapshot local variable in the same commit as adding
the helpers because I thought it made the diff harder to understand (for
review, the final commit should likely not be separate patches).

> 4. I think it might be a good idea to use unlikely() in if
> (!scan->rs_inited). The idea is to help coax the compiler into moving
> that code off to a cold path. That's likely especially important if
> heapgettup_initial_block is inlined, which I see it is marked as.

I've gone ahead and added unlikely. However, should I perhaps skip
inlining the heapgettup_initial_block() function?

> v6-0004:
>
> 5. heapgettup_start_page() and heapgettup_continue_page() both need a
> header comment to explain what they do and what the inputs and output
> arguments are.

I've added these. I've also removed an unused parameter to both, block.

>
> 6. I'm not too sure what the following comment means:
>
> /* block and lineoff now reference the physically next tid */
>
> "block" is just a parameter to the function and its value is not
> adjusted. The comment makes it sound like something was changed.
>
> (I think these might be just not well updated from having split this
> out from the 0006 patch as the same comment makes more sense in 0006)

Yes, that is true. I've updated it to just mention lineoff.

> v6-0005:
>
> 7. heapgettup_advance_block() needs a header comment.
>
> 8. Is there a reason why heapgettup_advance_block() handle backward
> scans first? I'd expect you should just follow the lead of the other
> functions and do ScanDirectionIsForward first.

The reason I do this is that backwards scans cannot be parallel, so
handling backwards scans first let me return, then handle parallel
scans, then forward scans. This reduced the level of nesting/if
statements for all of the code in this function.

- Melanie

Attachment Content-Type Size
v7-0002-Add-heapgettup_initial_block-helper.patch text/x-patch 10.0 KB
v7-0001-Remove-NoMovementScanDirection-inconsistencies.patch text/x-patch 13.1 KB
v7-0005-Add-heapgettup_advance_block-helper.patch text/x-patch 7.4 KB
v7-0004-Add-heapgettup-helpers.patch text/x-patch 4.4 KB
v7-0003-Streamline-heapgettup-for-refactor.patch text/x-patch 10.6 KB
v7-0006-Refactor-heapgettup-and-heapgettup_pagemode.patch text/x-patch 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-01-30 23:30:40 monitoring usage count distribution
Previous Message Michael Paquier 2023-01-30 23:13:11 Re: recovery modules