Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: noah(at)leadboat(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pryzby(at)telsasoft(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation
Date: 2020-09-08 00:13:53
Message-ID: 20200908.091353.1505164218636547516.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Mon, 7 Sep 2020 02:32:55 -0700, Noah Misch <noah(at)leadboat(dot)com> wrote in
> On Mon, Sep 07, 2020 at 05:40:36PM +0900, Kyotaro Horiguchi wrote:
> > At Mon, 07 Sep 2020 13:45:28 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > > The cause is that the worker received the pending-sync entry correctly
> > > but never created a relcache entry for the relation using
> > > RelationBuildDesc, so rd_firstRelfilenodeSubid is not set
> > > correctly.
> > >
> > > I'm investigating it.
> >
> > The relcache is loaded from a file with old content at parallel worker
> > startup. The relcache entry is corrected by invalidation when taking a
> > lock, but pending syncs are not considered.
> >
> > Since parallel workers don't access the files, we can safely ignore
> > the assertion, but I want to set the rd_firstRelfilenodeSubid flag at
> > invalidation, as in the attached PoC patch.
>
> > [patch: When RelationInitPhysicalAddr() handles a mapped relation, re-fill
> > rd_firstRelfilenodeSubid from RelFileNodeSkippingWAL(), like
> > RelationBuildDesc() would do.]
>
> As a PoC, this looks promising. Thanks. Would you add a test case such that
> the following demonstrates the bug in the absence of your PoC?
>
> printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
> make check TEMP_CONFIG=/tmp/minimal.conf

Mmm. I was close to adding some tests to 018_wal_optimize.pl, but your
suggestion seems better. I added several lines to create_index.sql.
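
For reference, the configuration in the quoted command can be sketched as
below. This is only an illustration of what the printf call writes; the
make step is left as a comment because it assumes an already-built
PostgreSQL source tree as the current directory. max_wal_senders = 0 is
needed because the server refuses to start with wal_level = minimal when
WAL senders are enabled.

```shell
# Write the three settings, one per line, to a temporary config file.
# The path /tmp/minimal.conf is the one used in the quoted command.
printf '%s\n%s\n%s\n' \
    'log_statement = all' \
    'wal_level = minimal' \
    'max_wal_senders = 0' >/tmp/minimal.conf

# Show the resulting file so the settings can be checked by eye.
cat /tmp/minimal.conf

# Then, from the top of a built PostgreSQL source tree (assumption):
#   make check TEMP_CONFIG=/tmp/minimal.conf
```

With these settings the regression suite, including the new lines in
create_index.sql, runs under wal_level = minimal.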

> Please have the test try both a nailed-and-mapped relation and a "nailed, but
> not mapped" relation. I am fairly confident that your PoC fixes the former
> case, but the latter may need additional code.

Mmm. You're right. I chose pg_amproc_fam_proc_index as the
nailed-but-not-mapped index.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Fix-assertion-failure-during-reindex-while-wal_level.patch text/x-patch 3.2 KB
