Quick Links

Re: BUG #15309: ERROR: catalog is missing 1 attribute(s) for relid 760676 when max_parallel_maintenance_workers > 0

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	deathlock13(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject:	Re: BUG #15309: ERROR: catalog is missing 1 attribute(s) for relid 760676 when max_parallel_maintenance_workers > 0
Date:	2018-08-06 17:43:15
Message-ID:	CAH2-Wzk129bwcz9WY_veqigCSRB0m+6kgUiDeGB56xZPioOdqA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

On Fri, Aug 3, 2018 at 6:53 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> =?utf-8?q?PG_Bug_reporting_form?= <noreply(at)postgresql(dot)org> writes:
>> 11beta2 and git REL_11_STABLE, gcc 8.2, pg.conf : autovacuum=off
>> DB restored from plain-format backup, first 'vacuum full' ends with ERR :
>> "catalog is missing 1 attribute(s) for relid ..." - seem to point to only
>> PrimaryKey, sometimes the same PK
>
> If you could provide a self-contained example, this would be very
> interesting, but there's nothing we can do with just the information
> you've provided here.

I got an off-list testcase from the OP, who writes:

"""
I've attached my 'SHOW ALL' and a backup of example db from PG wiki,
just kept renaming schema/tables until got a large enough number of
indexes - seem to be key to reproducing error(s).

or 2018-08-06 11:20:13 CEST|PUMA|postgres|[local]| DETAIL: Key
(relname, relnamespace)=(idx_last_name, 16498) already exists.

Got this after runing your query with amcheck before 'VACUUM FULL'
2018-08-06 11:51:10 CEST|||| PANIC: could not open critical system index 2676
2018-08-06 11:51:10 CEST|TEST|postgres|[local]| ERROR: could not open
critical system index 2676
2018-08-06 11:51:10 CEST|TEST|postgres|[local]| CONTEXT: parallel worker
2018-08-06 11:51:10 CEST|TEST|postgres|[local]| STATEMENT: vacuum full;
2018-08-06 11:51:10 CEST|||| LOG: background worker "parallel worker"
(PID 16618) was terminated by signal 6: Aborted
2018-08-06 11:51:10 CEST|||| LOG: terminating any other active server processes

2018-08-06 12:11:24 CEST|||| ERROR: could not open relation with OID 2696
2018-08-06 12:11:24 CEST|TEST|postgres|[local]| ERROR: could not open
relation with OID 2696
2018-08-06 12:11:24 CEST|TEST|postgres|[local]| CONTEXT: parallel worker
2018-08-06 12:11:24 CEST|TEST|postgres|[local]| STATEMENT: vacuum full;
2018-08-06 12:11:24 CEST|||| LOG: background worker "parallel worker"
(PID 16930) exited with exit code 1
"""

I can reproduce this with the backup provided. This seems to boil down
to the following:

1. Create a database with tens of thousands of relations, each of
which contain no data.

2. Run an unqualified VACUUM FULL.

This is what I see after the restore:

pg(at)foodb[22648]=# vacuum FULL ;
ERROR: duplicate key value violates unique constraint
"pg_class_relname_nsp_index"
DETAIL: Key (relname, relnamespace)=(customer_pkey, 16445) already exists.
pg(at)foodb[22648]=# :amcheck
ERROR: heap tuple (358,1) from table "pg_attribute" lacks matching
index tuple within index "pg_attribute_relid_attnam_index"

I'll work to isolate and diagnose the problem today. It likely has
something to do with corrupting the state needed by a catalog parallel
index build in the context of the VACUUM FULL. pg_attribute grows to
several tens of megabytes here, which is enough to get a parallel
index build.

--
Peter Geoghegan

In response to

Re: BUG #15309: ERROR: catalog is missing 1 attribute(s) for relid 760676 when max_parallel_maintenance_workers > 0 at 2018-08-03 13:53:09 from Tom Lane

Responses

Re: BUG #15309: ERROR: catalog is missing 1 attribute(s) for relid 760676 when max_parallel_maintenance_workers > 0 at 2018-08-06 20:31:16 from Peter Geoghegan

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Yahor Yuzefovich	2018-08-06 19:30:00	Re: Docker image of 11~beta2-2 orders strings case-insensitively
Previous Message	Mario de Frutos Dieguez	2018-08-06 16:48:19	Re: Fwd: Problem with a "complex" upsert