Re: error context for vacuum to include block number

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: error context for vacuum to include block number
Date: 2020-03-26 15:04:57
Message-ID: 20200326150457.GB17431@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 26, 2020 at 08:56:54PM +0900, Masahiko Sawada wrote:
> 1.
> @@ -1844,9 +1914,15 @@ lazy_vacuum_page(Relation onerel, BlockNumber
> blkno, Buffer buffer,
> int uncnt = 0;
> TransactionId visibility_cutoff_xid;
> bool all_frozen;
> + LVRelStats olderrcbarg;
>
> pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
>
> + /* Update error traceback information */
> + olderrcbarg = *vacrelstats;
> + update_vacuum_error_cbarg(vacrelstats, VACUUM_ERRCB_PHASE_VACUUM_HEAP,
> + blkno, NULL, false);
>
> Since we update vacrelstats->blkno during in the loop in
> lazy_vacuum_heap() we unnecessarily update blkno twice to the same
> value. Also I think we don't need to revert back the callback
> arguments in lazy_vacuum_page(). Perhaps we can either remove the
> change of lazy_vacuum_page() or move the code updating
> vacrelstats->blkno to the beginning of lazy_vacuum_page(). I prefer
> the latter.

We want the error callback to be in place during lazy_scan_heap, since it
calls ReadBufferExtended().

We can't remove the change in lazy_vacuum_page, since it's also called from
lazy_scan_heap, if there are no indexes.

We want lazy_vacuum_page to "revert back" since we go from "scanning heap" to
"vacuuming heap". lazy_vacuum_page was the motivation for saving and restoring
the called arguments, otherwise lazy_scan_heap() would have to clean up after
the function it called, which was unclean. Now, every function cleans up after
itself.

Does that address your comment ?

> +static void
> +update_vacuum_error_cbarg(LVRelStats *errcbarg, int phase, BlockNumber blkno,
> + char *indname, bool free_oldindname)
>
> I'm not sure why "free_oldindname" is necessary. Since we initialize
> vacrelstats->indname with NULL and revert the callback arguments at
> the end of functions that needs update them, vacrelstats->indname is
> NULL at the beginning of lazy_vacuum_index() and lazy_cleanup_index().
> And we make a copy of index name in update_vacuum_error_cbarg(). So I
> think we can pfree the old index name if errcbarg->indname is not NULL.

We want to avoid doing this:
olderrcbarg = *vacrelstats // saves a pointer
update_vacuum_error_cbarg(... NULL); // frees the pointer and sets indname to NULL
update_vacuum_error_cbarg(... olderrcbarg.oldindnam) // puts back the pointer, which has been freed
// hit an error, and the callback accesses the pfreed pointer

I think that's only an issue for lazy_vacuum_index().

And I think you're right: we only save state when the calling function has a
indname=NULL, so we never "put back" a non-NULL indname. We go from having a
indname=NULL at lazy_scan_heap to not not-NULL at lazy_vacuum_index, and never
the other way around. So once we've "reverted back", 1) the pointer is null;
and, 2) the callback function doesn't access it for the previous/reverted phase
anyway.

Hm, I was just wondering what happens if an error happens *during*
update_vacuum_error_cbarg(). It seems like if we set
errcbarg->phase=VACUUM_INDEX before setting errcbarg->indname=indname, then an
error would cause a crash. And if we pfree and set indname before phase, it'd
be a problem when going from an index phase to non-index phase. So maybe we
have to set errcbarg->phase=VACUUM_ERRCB_PHASE_UNKNOWN while in the function,
and errcbarg->phase=phase last.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2020-03-26 15:11:08 Re: pgsql: Provide a TLS init hook
Previous Message Konstantin Knizhnik 2020-03-26 14:43:49 Re: Columns correlation and adaptive query optimization