Re: prion failed with ERROR: missing chunk number 0 for toast value 14334 in pg_toast_2619

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: prion failed with ERROR: missing chunk number 0 for toast value 14334 in pg_toast_2619
Date: 2021-10-18 04:21:28
Message-ID: 20211018042128.GB4679@telsasoft.com
Lists: pgsql-hackers

On Sun, Oct 17, 2021 at 04:43:15PM -0500, Justin Pryzby wrote:
> On Sun, Aug 15, 2021 at 09:44:55AM -0500, Justin Pryzby wrote:
> > On Sun, May 16, 2021 at 04:23:02PM -0400, Tom Lane wrote:
> > > 1. Fix FullXidRelativeTo to be a little less trusting. It'd
> > > probably be sane to make it return FirstNormalTransactionId
> > > when it'd otherwise produce a wrapped-around FullXid, but is
> > > there any situation where we'd want it to throw an error instead?
> > >
> > > 2. Change pg_resetwal to not do the above. It's not entirely
> > > apparent to me what business it has trying to force
> > > autovacuum-for-wraparound anyway, but if it does need to do that,
> > > can we devise a less klugy method?
> > >
> > > It also seems like some assertions in procarray.c would be a
> > > good idea. With the attached patch, we get through core
> > > regression just fine, but the pg_upgrade test fails immediately
> > > after the "Resetting WAL archives" step.
> >
> > #2 is done as of 74cf7d46a.
> >
> > Is there a plan to include Tom's procarray assertions ?
>
> I'm confused about the state of this patch/thread.
>
> make check causes autovacuum crashes (but then the regression tests succeed
> anyway).

Sorry, I was confused here. autovacuum is not crashing as I said; the
BACKTRACE lines come from the LOG message added by Tom's debugging patch:

+    if (trace_toast_visibility)
+        ereport(LOG,
+                errmsg("HeapTupleSatisfiesToast: xmin %u t_infomask 0x%04x",
+                       HeapTupleHeaderGetXmin(tuple),
+                       tuple->t_infomask),
+                debug_query_string ? 0 : errbacktrace());

2021-10-17 22:56:57.066 CDT autovacuum worker[19601] LOG: HeapTupleSatisfiesToast: xmin 2 t_infomask 0x0b02
2021-10-17 22:56:57.066 CDT autovacuum worker[19601] BACKTRACE:
...
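
For anyone reading along, the t_infomask in that LOG line can be decoded by
hand. Below is a minimal sketch, not part of Tom's patch: describe_infomask()
is a hypothetical helper, and the bit values are copied by hand from
src/include/access/htup_details.h as of v14, so please check them against your
own tree. The only subtlety is that HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID
set together mean HEAP_XMIN_FROZEN, which is consistent with the "xmin 2"
(FrozenTransactionId) that HeapTupleHeaderGetXmin() reports above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * t_infomask bits, copied by hand from src/include/access/htup_details.h
 * (PostgreSQL 14).  Assumption: verify against your own source tree.
 */
#define HEAP_HASNULL            0x0001
#define HEAP_HASVARWIDTH        0x0002
#define HEAP_HASEXTERNAL        0x0004
#define HEAP_HASOID_OLD         0x0008
#define HEAP_XMAX_KEYSHR_LOCK   0x0010
#define HEAP_COMBOCID           0x0020
#define HEAP_XMAX_EXCL_LOCK     0x0040
#define HEAP_XMAX_LOCK_ONLY     0x0080
#define HEAP_XMIN_COMMITTED     0x0100
#define HEAP_XMIN_INVALID       0x0200
#define HEAP_XMIN_FROZEN        (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)
#define HEAP_XMAX_COMMITTED     0x0400
#define HEAP_XMAX_INVALID       0x0800
#define HEAP_XMAX_IS_MULTI      0x1000
#define HEAP_UPDATED            0x2000
#define HEAP_MOVED_OFF          0x4000
#define HEAP_MOVED_IN           0x8000

typedef struct
{
    uint16_t    mask;
    const char *name;
} InfomaskFlag;

static const InfomaskFlag infomask_flags[] = {
    {HEAP_HASNULL, "HEAP_HASNULL"},
    {HEAP_HASVARWIDTH, "HEAP_HASVARWIDTH"},
    {HEAP_HASEXTERNAL, "HEAP_HASEXTERNAL"},
    {HEAP_HASOID_OLD, "HEAP_HASOID_OLD"},
    {HEAP_XMAX_KEYSHR_LOCK, "HEAP_XMAX_KEYSHR_LOCK"},
    {HEAP_COMBOCID, "HEAP_COMBOCID"},
    {HEAP_XMAX_EXCL_LOCK, "HEAP_XMAX_EXCL_LOCK"},
    {HEAP_XMAX_LOCK_ONLY, "HEAP_XMAX_LOCK_ONLY"},
    {HEAP_XMIN_COMMITTED, "HEAP_XMIN_COMMITTED"},
    {HEAP_XMIN_INVALID, "HEAP_XMIN_INVALID"},
    {HEAP_XMAX_COMMITTED, "HEAP_XMAX_COMMITTED"},
    {HEAP_XMAX_INVALID, "HEAP_XMAX_INVALID"},
    {HEAP_XMAX_IS_MULTI, "HEAP_XMAX_IS_MULTI"},
    {HEAP_UPDATED, "HEAP_UPDATED"},
    {HEAP_MOVED_OFF, "HEAP_MOVED_OFF"},
    {HEAP_MOVED_IN, "HEAP_MOVED_IN"},
};

/* Append one flag name to buf, space-separated; returns the new length. */
static size_t
append_name(char *buf, size_t buflen, size_t used, const char *name)
{
    int         n = snprintf(buf + used, buflen - used, "%s%s",
                             used ? " " : "", name);

    /* On error or truncation, cut the string back to its previous length. */
    if (n < 0 || (size_t) n >= buflen - used)
    {
        buf[used] = '\0';
        return used;
    }
    return used + (size_t) n;
}

/*
 * Hypothetical helper: write the names of the bits set in infomask into buf
 * and return buf.  Names that do not fit in buf are silently dropped.
 */
const char *
describe_infomask(uint16_t infomask, char *buf, size_t buflen)
{
    size_t      used = 0;

    assert(buflen > 0);
    buf[0] = '\0';

    /* Both xmin hint bits together mean "frozen", so report them as one. */
    if ((infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN)
    {
        used = append_name(buf, buflen, used, "HEAP_XMIN_FROZEN");
        infomask &= (uint16_t) ~HEAP_XMIN_FROZEN;
    }

    for (size_t i = 0; i < sizeof(infomask_flags) / sizeof(infomask_flags[0]); i++)
    {
        if (infomask & infomask_flags[i].mask)
            used = append_name(buf, buflen, used, infomask_flags[i].name);
    }
    return buf;
}
```

Calling describe_infomask(0x0b02, buf, sizeof(buf)) on the value from the log
gives "HEAP_XMIN_FROZEN HEAP_HASVARWIDTH HEAP_XMAX_INVALID". On a running
server, pageinspect's heap_tuple_infomask_flags() should give the same answer
without any hand-copied constants.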

I see that the pg_statistic problem can still occur on v14. I still don't have
a recipe to reproduce it, though, other than running VACUUM FULL in a loop.
Can I provide anything useful to debug it, such as xmin, infomask, a core
dump, or logs with log_autovacuum_min_duration=0?

--
Justin
