From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Subject: | Re: display offset along with block number in vacuum errors |
Date: | 2020-08-01 06:17:21 |
Message-ID: | 20200801061721.GF20393@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jul 31, 2020 at 04:55:14PM -0500, Justin Pryzby wrote:
> Bcc:
> Subject: Re: display offset along with block number in vacuum errors
> Reply-To:
> In-Reply-To: <CAKYtNApLJjAaRw0UEBBY6G1o0LRZKS7rA5n46BFh+NfwSOycdg(at)mail(dot)gmail(dot)com>
whoops
> On Wed, Jul 29, 2020 at 12:35:17AM +0530, Mahendra Singh Thalor wrote:
> > > Here:
> > >
> > > @@ -1924,14 +1932,22 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
> > > BlockNumber tblk;
> > > OffsetNumber toff;
> > > ItemId itemid;
> > > + LVSavedErrInfo loc_saved_err_info;
> > >
> > > tblk = ItemPointerGetBlockNumber(&dead_tuples->itemptrs[tupindex]);
> > > if (tblk != blkno)
> > > break; /* past end of tuples for this block */
> > > toff = ItemPointerGetOffsetNumber(&dead_tuples->itemptrs[tupindex]);
> > > +
> > > + /* Update error traceback information */
> > > + update_vacuum_error_info(vacrelstats, &loc_saved_err_info, VACUUM_ERRCB_PHASE_VACUUM_HEAP,
> > > + blkno, toff);
> > > itemid = PageGetItemId(page, toff);
> > > ItemIdSetUnused(itemid);
> > > unused[uncnt++] = toff;
> > > +
> > > + /* Revert to the previous phase information for error traceback */
> > > + restore_vacuum_error_info(vacrelstats, &loc_saved_err_info);
> > > }
> > >
> > > I'm not sure why you use restore_vacuum_error_info() at all. It's already
> > > called at the end of lazy_vacuum_page() (and others) to allow functions to
> > > clean up after their own state changes, rather than requiring callers to do it.
> > > I don't think you should use it in a loop, nor introduce another
> > > LVSavedErrInfo.
> > >
> > > Since phase and blkno are already set, I think you should just set
> > > vacrelstats->offnum = toff, rather than calling update_vacuum_error_info().
> > > Similar to whats done in lazy_vacuum_heap():
> > >
> > > tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples->itemptrs[tupindex]);
> > > vacrelstats->blkno = tblk;
> >
> > Fixed.
>
> I rearead this thread and I think the earlier suggestion from Masahiko was
> right. The loop around dead_tuples only does ItemIdSetUnused() which updates
> the page, which has already been read from disk. On my suggestion, your v2
> patch sets offnum directly, but now I think it's not useful to set at all,
> since the whole page is manipulated by PageRepairFragmentation() and
> log_heap_clean(). An error there would misleadingly say "..at offset number
> MM", but would always show the page's last offset, and not the offset where an
> error occured.
This makes me question whether offset numbers are ever useful during
VACUUM_HEAP, since the real work is done a page at a time (not tuple) or by
internal functions that don't update vacrelstats->offno. Note that my initial
problem report that led to the errcontext implementation was an ERROR in heap
*scan* (not vacuum). So an offset number at that point would've been
sufficient.
https://www.postgresql.org/message-id/20190808012436.GG11185@telsasoft.com
I mentioned that lazy_check_needs_freeze() should save and restore the errinfo,
so an error in heap_page_prune() (for example) doesn't get the wrong offset
associated with it.
So see the attached changes on top of your v2 patch.
postgres=# DROP TABLE tt; CREATE TABLE tt(a int) WITH (fillfactor=90); INSERT INTO tt SELECT generate_series(1,99999); VACUUM tt; UPDATE tt SET a=1 WHERE ctid='(345,10)'; UPDATE pg_class SET relfrozenxid=(relfrozenxid::text::int + (1<<30))::int::text::xid WHERE oid='tt'::regclass; VACUUM tt;
ERROR: found xmin 1961 from before relfrozenxid 1073743785
CONTEXT: while scanning block 345 of relation "public.tt", item offset 205
Hmm.. is it confusing that the block number is 0-indexed but the offset is
1-indexed ?
--
Justin
Attachment | Content-Type | Size |
---|---|---|
0001-Added-offset-with-block-number-in-vacuum-errcontext.patch | text/x-diff | 10.2 KB |
0002-fix.patch | text/x-diff | 4.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Mahendra Singh Thalor | 2020-08-01 07:01:53 | Re: display offset along with block number in vacuum errors |
Previous Message | vignesh C | 2020-08-01 04:24:54 | Re: Parallel copy |