Re: BUG #15309: ERROR: catalog is missing 1 attribute(s) for relid 760676 when max_parallel_maintenance_workers > 0

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: deathlock13(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15309: ERROR: catalog is missing 1 attribute(s) for relid 760676 when max_parallel_maintenance_workers > 0
Date: 2018-08-06 21:21:33
Message-ID: CAH2-Wzn=j0i8rxCAo6E=tBO9XuYXb8HbUsnW7J_StKON8dDOhQ@mail.gmail.com
Lists: pgsql-bugs

On Mon, Aug 6, 2018 at 1:37 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> To be clear, I mean that the leader process's worker state has the
> right relfilenode (the leader process always participates as a
> worker), but all worker processes have the stale relfilenode.

Sure enough, that's what the bug is - a few debugging calls to
RelationMapFilenodeToOid() within nbtsort.c prove it. Several
approaches to fixing the bug occur to me:

* Ban parallel CREATE INDEX for all catalogs. This was how things were
up until several weeks before the original patch was committed.

* Ban parallel CREATE INDEX for mapped catalogs only.

* Find a way to propagate the state necessary to have parallel workers
agree with the leader on the correct relfilenode.

We could probably propagate backend-local state like
active_local_updates without too much difficulty, which looks like it
would fix the problem. Note that we did something very similar with
reindex-pending-indexes lists in commit 29d58fd3. That commit
similarly involved propagating more backend-local state so that
parallel index builds (or at least REINDEX) on catalogs could be
enabled and work reliably. Maybe we should continue down the road of
making parallel builds work on catalogs, on general principle.
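
To make the third option concrete, here is a toy sketch of the
estimate/serialize/restore shape such propagation could take. It is
not PostgreSQL code: every type and function name below (ToyMapFile,
toy_map_update, toy_serialize, and so on) is hypothetical, the flat
buffer merely stands in for a shared memory chunk handed to workers,
and the OIDs and relfilenodes are made-up values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_MAPPINGS 8

/* Hypothetical stand-in for one mapping: catalog OID -> relfilenode. */
typedef struct ToyMapping
{
    uint32_t    relid;
    uint32_t    relfilenode;
} ToyMapping;

/* Hypothetical stand-in for a small map of such entries. */
typedef struct ToyMapFile
{
    int32_t     num_mappings;
    ToyMapping  mappings[TOY_MAX_MAPPINGS];
} ToyMapFile;

/* Record a mapping, overwriting any existing entry for relid. */
void
toy_map_update(ToyMapFile *map, uint32_t relid, uint32_t relfilenode)
{
    for (int i = 0; i < map->num_mappings; i++)
    {
        if (map->mappings[i].relid == relid)
        {
            map->mappings[i].relfilenode = relfilenode;
            return;
        }
    }
    assert(map->num_mappings < TOY_MAX_MAPPINGS);
    map->mappings[map->num_mappings].relid = relid;
    map->mappings[map->num_mappings].relfilenode = relfilenode;
    map->num_mappings++;
}

/*
 * Look up relid.  Pending backend-local updates take precedence over
 * the committed map that every process can see.  Returns 0 if unknown.
 */
uint32_t
toy_map_lookup(const ToyMapFile *pending, const ToyMapFile *committed,
               uint32_t relid)
{
    for (int i = 0; i < pending->num_mappings; i++)
        if (pending->mappings[i].relid == relid)
            return pending->mappings[i].relfilenode;
    for (int i = 0; i < committed->num_mappings; i++)
        if (committed->mappings[i].relid == relid)
            return committed->mappings[i].relfilenode;
    return 0;
}

/* Leader: how many bytes the pending state needs in the shared chunk. */
size_t
toy_estimate_space(void)
{
    return sizeof(ToyMapFile);
}

/* Leader: copy its pending state into the shared chunk. */
void
toy_serialize(const ToyMapFile *pending, void *dest)
{
    memcpy(dest, pending, sizeof(ToyMapFile));
}

/* Worker: adopt the leader's pending state at startup. */
void
toy_restore(ToyMapFile *pending, const void *src)
{
    memcpy(pending, src, sizeof(ToyMapFile));
}
```

In this model, a worker that starts with empty pending state resolves
a catalog to the committed (stale) relfilenode, while the leader
resolves it to the new one. Once the worker calls toy_restore() on the
buffer the leader filled with toy_serialize(), both lookups agree.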

Thoughts?

--
Peter Geoghegan
