Re: New IndexAM API controlling index vacuum strategies

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-04-05 11:30:00
Message-ID: CAD21AoAkBG3vJ2-Mqe-feq4CkHyg5wTGFjpAs4yMPq9HxuAmAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 4, 2021 at 11:00 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Apr 1, 2021 at 8:25 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > I've also found a way to further simplify the table-without-indexes
> > case: make it behave like a regular two-pass/has-indexes VACUUM with
> > regard to visibility map stuff when the page doesn't need a call to
> > lazy_vacuum_heap() (because there are no LP_DEAD items to set
> > LP_UNUSED on the page following pruning). But when it does call
> > lazy_vacuum_heap(), the call takes care of everything for
> > lazy_scan_heap(), which just continues to the next page due to
> > considering prunestate to have been "invalidated" by the call to
> > lazy_vacuum_heap(). So there is absolutely minimal special case code
> > for the table-without-indexes case now.
>
> Attached is v10, which simplifies the one-pass/table-without-indexes
> VACUUM as described.
>

Thank you for updating the patch.

>
> * I now include a modified version of Matthias van de Meent's line
> pointer truncation patch [1].
>
> Matthias' patch seems very much in scope here. The broader patch
> series establishes the principle that we can leave LP_DEAD line
> pointers in an unreclaimed state indefinitely, without consequence
> (beyond the obvious). We had better avoid line pointer bloat that
> cannot be reversed when VACUUM does eventually get around to doing a
> second pass over the heap. This is another case where it seems prudent
> to keep the costs understandable/linear -- page-level line pointer
> bloat seems like a cost that increases in a non-linear fashion, which
> undermines the whole idea of modelling when it's okay to skip
> index/heap vacuuming. (Also, line pointer bloat sucks.)
>
> Line pointer truncation doesn't happen during pruning, as it did in
> Matthias' original patch. In this revised version, line pointer
> truncation occurs during the second phase of VACUUM. There are several
> reasons to prefer this approach. It seems both safer and more useful
> that way (compared to the approach of doing line pointer truncation
> during pruning). It also makes intuitive sense to do it this way, at
> least to me -- the second pass over the heap is supposed to be for
> "freeing" LP_DEAD line pointers.

+1

0002, 0003, and 0004 patches look good to me. 0001 and 0005 also look
good to me but I have some trivial review comments on them.

0001 patch:

/*
- * Now that stats[idx] points to the DSM segment, we
don't need the
- * locally allocated results.
+ * Now that top-level indstats[idx] points to the DSM
segment, we
+ * don't need the locally allocated results.
*/
- pfree(*stats);
- *stats = bulkdelete_res;
+ pfree(istat);
+ istat = bulkdelete_res;

Did you try the change around parallel_process_one_index() that I
suggested in the previous reply[1]? If we don't change the logic, we
need to update the above comment. Previously, we update stats[idx] in
vacuum_one_index() (renamed to parallel_process_one_index()) but with
your patch, where we update it is its caller.

---
+lazy_vacuum_all_indexes(LVRelState *vacrel)
{
- Assert(!IsParallelWorker());
- Assert(nindexes > 0);
+ Assert(vacrel->nindexes > 0);
+ Assert(TransactionIdIsNormal(vacrel->relfrozenxid));
+ Assert(MultiXactIdIsValid(vacrel->relminmxid));

and

- Assert(!IsParallelWorker());
- Assert(nindexes > 0);
+ Assert(vacrel->nindexes > 0);

We removed two Assert(!IsParallelWorker()) at two places. It seems to
me that those assertions are still valid. Do we really need to remove
them?

---
0004 patch:

src/backend/access/heap/heapam.c:638: trailing whitespace.
+ /*

I found a whitespace issue.

---
0005 patch:

+ * Caller is expected to call here before and after vacuuming each index in
+ * the case of two-pass VACUUM, or every BYPASS_EMERGENCY_MIN_PAGES blocks in
+ * the case of no-indexes/one-pass VACUUM.

I think it should be "every VACUUM_FSM_EVERY_PAGES blocks" instead of
"every BYPASS_EMERGENCY_MIN_PAGES blocks".

---
+/*
+ * Threshold that controls whether we bypass index vacuuming and heap
+ * vacuuming. When we're under the threshold they're deemed unnecessary.
+ * BYPASS_THRESHOLD_PAGES is applied as a multiplier on the table's rel_pages
+ * for those pages known to contain one or more LP_DEAD items.
+ */
+#define BYPASS_THRESHOLD_PAGES 0.02 /* i.e. 2% of rel_pages */
+
+#define BYPASS_EMERGENCY_MIN_PAGES \
+ ((BlockNumber) (((uint64) 4 * 1024 * 1024 * 1024) / BLCKSZ))
+

I think we need a description for BYPASS_EMERGENCY_MIN_PAGES.

---
for (int idx = 0; idx < vacrel->nindexes; idx++)
{
Relation indrel = vacrel->indrels[idx];
IndexBulkDeleteResult *istat = vacrel->indstats[idx];

vacrel->indstats[idx] =
lazy_vacuum_one_index(indrel, istat, vacrel->old_live_tuples,
vacrel);
+
+ if (should_speedup_failsafe(vacrel))
+ {
+ /* Wraparound emergency -- end current index scan */
+ allindexes = false;
+ break;
+ }

allindexes can be false even if we process all indexes, which is fine
with me because setting allindexes = false disables the subsequent
heap vacuuming. I think it's appropriate behavior in emergency cases.
In that sense, can we do should_speedup_failsafe() check also after
parallel index vacuuming? And we can also check it at the beginning of
lazy vacuum.

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoDOWo4H6vmtLZoJ2SznMp_zOej2Kww%2BJBkVRPXs%2Bj48uw%40mail.gmail.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-04-05 11:55:04 Re: Flaky vacuum truncate test in reloptions.sql
Previous Message Bharath Rupireddy 2021-04-05 11:15:32 Re: [Patch] ALTER SYSTEM READ ONLY