Re: Emit fewer vacuum records by reaping removable tuples during pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Michael Paquier <michael(at)paquier(dot)xyz>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Emit fewer vacuum records by reaping removable tuples during pruning
Date: 2024-01-15 17:29:57
Message-ID: CA+Tgmoa9e+Xk5onuV-qiCDXy9YqSq5=9BmrTx0g3TwgayKKeOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 12, 2024 at 4:05 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> Yea, that works for now. I mean, I think the way we should do it is
> update the FSM in lazy_scan_noprune(), but, for the purposes of this
> patch, yes. has_lpdead_items output parameter seems fine to me.

Here's v2.

It's not exactly clear to me why you want to update the FSM in
lazy_scan_[no]prune(). When I first looked at v7-0004, I said to
myself "well, this is dumb, because now we're just duplicating
something that is common to both cases". But then I realized that the
logic was *already* duplicated in lazy_scan_heap(), and that by
pushing the duplication down to lazy_scan_[no]prune(), you made the
two copies identical rather than, as at present, having two copies
that differ from each other. Perhaps that's a sufficient reason to
make the change all by itself, but it seems like what would be really
good is if we only needed one copy of the logic. I don't know if
that's achievable, though.

More generally, I somewhat question whether it's really right to push
things from lazy_scan_heap() and into lazy_scan_[no]prune(), mostly
because of the risk of having to duplicate logic. I'm not even really
convinced that it's good for us to have both of those functions.
There's an awful lot of code duplication between them already. Each
has a loop that tests the status of each item and then, for LP_USED,
switches on htsv_get_valid_status or HeapTupleSatisfiesVacuum. It
doesn't seem trivial to unify all that code, but it doesn't seem very
nice to have two copies of it, either.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-0001-Be-more-consistent-about-whether-to-update-the-FS.patch application/octet-stream 5.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-01-15 17:47:58 Re: Oom on temp (un-analyzed table caused by JIT) V16.1
Previous Message Robert Haas 2024-01-15 16:58:04 Re: cleanup patches for incremental backup