Re: Fixing a couple of buglets in how VACUUM sets visibility map bits

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Fixing a couple of buglets in how VACUUM sets visibility map bits
Date: 2023-01-09 19:57:29
Message-ID: 20230109195729.7n5gidgfirsbtsjx@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-01-09 10:16:03 -0800, Peter Geoghegan wrote:
> Attached is v3, which explicitly checks the need to set the PD_ALL_VISIBLE
> flag at the relevant visibilitymap_set() call site. It also has improved
> comments.

Afaict we'll need to backpatch this all the way?

> From e7788ebdb589fb7c6f866cf53658cc369f9858b5 Mon Sep 17 00:00:00 2001
> From: Peter Geoghegan <pg(at)bowt(dot)ie>
> Date: Sat, 31 Dec 2022 15:13:01 -0800
> Subject: [PATCH v3] Don't accidentally unset all-visible bit in VM.
>
> One of the calls to visibilitymap_set() during VACUUM's initial heap
> pass could unset a page's all-visible bit during the process of setting
> the same page's all-frozen bit.

As just mentioned upthread, this just seems wrong.

> This could happen in the event of a
> concurrent HOT update from a transaction that aborts soon after. Since
> the all_visible_according_to_vm local variable that lazy_scan_heap works
> off of when setting the VM doesn't reflect the current state of the VM,
> and since visibilitymap_set() just requested that the all-frozen bit get
> set in one case, there was a race condition. Heap pages could initially
> be all-visible just as all_visible_according_to_vm is established, then
> not be all-visible after the update, and then become eligible to be set
> all-visible once more following pruning by VACUUM. There is no reason
> why VACUUM can't remove a concurrently aborted heap-only tuple right
> away, and so no reason why such a page won't be able to reach the
> relevant visibilitymap_set() call site.

Do you have a reproducer for this?

> @@ -1120,8 +1123,8 @@ lazy_scan_heap(LVRelState *vacrel)
> * got cleared after lazy_scan_skip() was called, so we must recheck
> * with buffer lock before concluding that the VM is corrupt.
> */
> - else if (all_visible_according_to_vm && !PageIsAllVisible(page)
> - && VM_ALL_VISIBLE(vacrel->rel, blkno, &vmbuffer))
> + else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
> + visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
> {
> elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
> vacrel->relname, blkno);

Hm. The message gets a bit less accurate with the change. Perhaps OK? OTOH, it
might be useful to know what bit was wrong when debugging problems.

> @@ -1164,12 +1167,34 @@ lazy_scan_heap(LVRelState *vacrel)
> !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
> {
> /*
> - * We can pass InvalidTransactionId as the cutoff XID here,
> - * because setting the all-frozen bit doesn't cause recovery
> - * conflicts.
> + * Avoid relying on all_visible_according_to_vm as a proxy for the
> + * page-level PD_ALL_VISIBLE bit being set, since it might have
> + * become stale -- even when all_visible is set in prunestate.
> + *
> + * Consider the example of a page that starts out all-visible and
> + * then has a tuple concurrently deleted by an xact that aborts.
> + * The page will be all_visible_according_to_vm, and will have
> + * all_visible set in prunestate. It will nevertheless not have
> + * PD_ALL_VISIBLE set by here (plus neither VM bit will be set).
> + * And so we must check if PD_ALL_VISIBLE needs to be set.
> */
> + if (!PageIsAllVisible(page))
> + {
> + PageSetAllVisible(page);
> + MarkBufferDirty(buf);
> + }
> +
> + /*
> + * Set the page all-frozen (and all-visible) in the VM.
> + *
> + * We can pass InvalidTransactionId as our visibility_cutoff_xid,
> + * since a snapshotConflictHorizon sufficient to make everything
> + * safe for REDO was logged when the page's tuples were frozen.
> + */
> + Assert(!TransactionIdIsValid(prunestate.visibility_cutoff_xid));
> visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
> vmbuffer, InvalidTransactionId,
> + VISIBILITYMAP_ALL_VISIBLE |
> VISIBILITYMAP_ALL_FROZEN);
> }
>
> @@ -1311,7 +1336,11 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
>
> /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
> if (!vacrel->skipwithvm)
> + {
> + /* Caller shouldn't rely on all_visible_according_to_vm */
> + *next_unskippable_allvis = false;
> break;
> + }
>
> /*
> * Aggressive VACUUM caller can't skip pages just because they are
> @@ -1818,7 +1847,11 @@ retry:
> * cutoff by stepping back from OldestXmin.
> */
> if (prunestate->all_visible && prunestate->all_frozen)
> + {
> + /* Using same cutoff when setting VM is now unnecessary */
> snapshotConflictHorizon = prunestate->visibility_cutoff_xid;
> + prunestate->visibility_cutoff_xid = InvalidTransactionId;
> + }
> else
> {
> /* Avoids false conflicts when hot_standby_feedback in use */
> @@ -2388,8 +2421,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
> static void
> lazy_vacuum_heap_rel(LVRelState *vacrel)
> {
> - int index;
> - BlockNumber vacuumed_pages;
> + int index = 0;
> + BlockNumber vacuumed_pages = 0;
> Buffer vmbuffer = InvalidBuffer;
> LVSavedErrInfo saved_err_info;
>
> @@ -2406,42 +2439,42 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
> VACUUM_ERRCB_PHASE_VACUUM_HEAP,
> InvalidBlockNumber, InvalidOffsetNumber);
>
> - vacuumed_pages = 0;
> -
> - index = 0;
> while (index < vacrel->dead_items->num_items)
> {
> - BlockNumber tblk;
> + BlockNumber blkno;
> Buffer buf;
> Page page;
> Size freespace;
>
> vacuum_delay_point();

I still think such changes are inappropriate for a bugfix, particularly one
that needs to be backpatched.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-01-09 20:03:47 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message Andres Freund 2023-01-09 19:44:43 Re: Fixing a couple of buglets in how VACUUM sets visibility map bits