Re: Emit fewer vacuum records by reaping removable tuples during pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date: 2024-01-09 18:31:30
Message-ID: CA+TgmoZgQYNVx+W2ks2u4j6AbErY57qR8ecu2Hr-cY76mqQOjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 9, 2024 at 10:56 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> Andres had actually said that he didn't like pushing the update of
> nonempty_pages into lazy_scan_[no]prune(). So, my v4 patch set
> eliminates this.

Mmph - but I was so looking forward to killing hastup!

> On the other hand, the comment above lazy_scan_new_or_empty() says we
> can get rid of this special handling if we make relation extension
> crash safe. Then it would make more sense to have a consolidated FSM
> update in lazy_scan_heap(). However it does still mean that we repeat
> the "UnlockReleaseBuffer()" and FSM update code in even more places.

I wouldn't hold my breath waiting for relation extension to become
crash-safe. Even if you were a whale, you'd be about four orders of
magnitude short of holding it long enough.

> Ultimately I can see arguments for and against. Is it better to avoid
> having the same few lines of code in two places or avoid unneeded
> communication between page-level functions and lazy_scan_heap()?

To me, the conceptual complexity of an extra structure member is a
bigger cost than duplicating TWO lines of code. If we were talking
about 20 lines of code, I'd say rename it to something less dumb.

> > Except for this comment that I found misleading because this is not
> > about the fact that truncation is unsafe, it's about correctly
> > tracking the the last block where we have tuples to ensure a correct
> > truncation. Perhaps this could just reuse "Remember the location of
> > the last page with nonremovable tuples"? If people object to that,
> > feel free.
>
> I agree the comment could be better. But, simply saying that it tracks
> the last page with non-removable tuples makes it less clear how
> important this is. It makes it sound like it could be simply for stats
> purposes. I'll update the comment to something that includes that
> sentiment but is more exact than "rel truncation is unsafe".

I agree that "rel truncation is unsafe" is less clear than desirable,
but I'm not sure that I agree that tracking the last page with
non-removable tuples makes it sound unimportant. However, the comments
also need to make everybody happy, not just me. Maybe something like
"can't truncate away this page" or similar. A long-form comment that
really spells it out is fine too, but I don't know if we really need
it.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2024-01-09 18:46:17 Re: WIP Incremental JSON Parser
Previous Message Robert Haas 2024-01-09 18:18:51 Re: cleanup patches for incremental backup