Transactional sequence stuff breaks pg_upgrade

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Transactional sequence stuff breaks pg_upgrade
Date: 2017-06-11 21:17:07
Message-ID: 19785.1497215827@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I believe I've identified the reason why skink and some other buildfarm
members have been failing the pg_upgrade test recently. It is that
recent changes in sequence support have caused binary-upgrade restore
runs to do some sequence OID/relfilenode assignments without any heed
to the OIDs that pg_upgrade tried to impose on those sequences. Once
those sequences have relfilenodes other than the intended ones, they
are land mines for all subsequent pg_upgrade-controlled table OID
assignments.

I am not very sure why it's so hard to duplicate the misbehavior; perhaps,
in order to make the failure happen with the current regression tests,
it's necessary for a background auto-analyze to happen and consume some
OIDs (for pg_statistic TOAST entries) at just the wrong time. However,
I can definitely demonstrate that there are uncontrolled relfilenode
assignments happening during pg_upgrade's restore run. I stuck an
elog() call into GetNewObjectId(), along with generation of a stack
trace using backtrace(), and here is one example:

[593daad3.4863:2243] LOG: generated OID 16735
[593daad3.4863:2244] STATEMENT:
-- For binary upgrade, must preserve pg_class oids
SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('46851'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type oid
SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('46852'::pg_catalog.oid);

ALTER TABLE "itest10" ALTER COLUMN "a" ADD GENERATED BY DEFAULT AS IDENTITY (
SEQUENCE NAME "itest10_a_seq"
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1
);

postgres: postgres regression [local] ALTER TABLE(GetNewObjectId+0xda) [0x50397a]
postgres: postgres regression [local] ALTER TABLE(GetNewRelFileNode+0xec) [0x52430c]
postgres: postgres regression [local] ALTER TABLE(RelationSetNewRelfilenode+0x79) [0x851d59]
postgres: postgres regression [local] ALTER TABLE(AlterSequence+0x1cd) [0x5d976d]
postgres: postgres regression [local] ALTER TABLE() [0x75d279]
postgres: postgres regression [local] ALTER TABLE(standard_ProcessUtility+0xb7) [0x75dec7]
postgres: postgres regression [local] ALTER TABLE() [0x75cb1d]
postgres: postgres regression [local] ALTER TABLE(standard_ProcessUtility+0xb7) [0x75dec7]
postgres: postgres regression [local] ALTER TABLE() [0x759f0b]
postgres: postgres regression [local] ALTER TABLE() [0x75ae91]
postgres: postgres regression [local] ALTER TABLE(PortalRun+0x250) [0x75b740]
postgres: postgres regression [local] ALTER TABLE() [0x757be7]
postgres: postgres regression [local] ALTER TABLE(PostgresMain+0xe08) [0x759968]
postgres: postgres regression [local] ALTER TABLE(PostmasterMain+0x1a99) [0x6e21a9]
postgres: postgres regression [local] ALTER TABLE(main+0x6b8) [0x65b958]
/lib64/libc.so.6(__libc_start_main+0xfd) [0x3f3bc1ed1d]
postgres: postgres regression [local] ALTER TABLE() [0x473899]

Judging by when we started to see buildfarm failures, I think that
commit 3d79013b9 probably broke it, but the problem seems latent
in the whole concept of transactional sequence information.

Not sure what we want to do about it. One idea is to make
ALTER SEQUENCE not so transactional when in binary-upgrade mode.

(I'm also tempted to make GetNewRelFileNode complain if IsBinaryUpgrade
is true, but that's a separate matter.)

In any case, this is a "must fix" problem IMO, so I'll go add it to the
open items list.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2017-06-11 21:29:18 Re: PG10 transition tables, wCTEs and multiple operations on the same table
Previous Message Piotr Stefaniak 2017-06-11 21:14:36 Re: pgindent (was Re: [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.)