Re: BUG #15672: PostgreSQL 11.1/11.2 crashed after dropping a partition table

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, jianingy(dot)yang(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BUG #15672: PostgreSQL 11.1/11.2 crashed after dropping a partition table
Date: 2019-04-25 10:02:12
Message-ID: 6aa609be-9295-e2a4-f6d4-266e2ca79f56@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 2019/04/25 11:21, Amit Langote wrote:
> On 2019/04/25 8:27, Tom Lane wrote:
>> BTW, I hadn't ever looked closely at what the index reuse code
>> does, and now that I have, my heart just sinks. I think that
>> logic needs to be nuked from orbit. RelationPreserveStorage was
>> never meant to be abused in this way; I invite you to contemplate
>> whether it's not a problem that it doesn't check the backend and
>> nestLevel fields of PendingRelDelete entries before killing them.
>> (In its originally-designed use for mapped rels at transaction end,
>> that wasn't a problem, but I'm afraid that it may be one now.)
>>
>> The right way to do this IMO would be something closer to the
>> heap-swap logic in cluster.c, where we exchange the relfilenodes
>> of two live indexes, rather than what is happening now. Or for
>> that matter, do we really need to delete the old indexes at all?
>
> Yeah, we wouldn't need TryReuseIndex and subsequent
> RelationPreserveStorage if we hadn't dropped the old indexes to begin
> with. Instead, in ATPostAlterTypeParse, check if the index after ALTER
> TYPE is still incompatible (CheckIndexCompatible) and if it is, don't add
> a new AT_ReAddIndex command. If it's not, *then* drop the index, and
> recreate the index from scratch using an IndexStmt generated from the old
> index definition. I guess We can get rid of IndexStmt.oldNode too.

Thinking on this more and growing confident that we could indeed avoid
drop index + recreate-it-while-preserving-storage, instead by just not
doing anything when CheckIndexCompatible says the old index will be fine
despite ALTER TYPE, but only if the table is not rewritten. I gave this a
try and came up with the attached patch. It fixes the bug related to
partitioned indexes (the originally reported one) and then some.

Basically, I aimed to rewrite the code in ATPostAlterTypeCleanup and
ATPostAlterTypeParse such that we no longer have to rely on an
implementation based on setting "oldNode" to preserve old indexes. With
the attached, for the cases in which the table won't be rewritten and
hence the indexes not rebuilt, ATPostAlterTypeParse() simply won't queue a
AT_ReAddIndex command to rebuild index while preserving the storage. That
means both ATAddIndex() and DefineIndex can be freed of the duty of
looking out for the "oldNode" case, because that case no longer exists.

Another main change is that inherited (!conislocal) constraints are now
recognized by ATPostAlterTypeParse directly, instead of
ATPostAlterTypeCleanup checking for them and skipping
ATPostAlterTypeParse() as a whole for such constraints. For one, I had to
make that change to make the above-described approach work. Also, doing
that allowed to fix another bug whereby the comments of child constraints
would go away when they're reconstructed. Notice what happens on
un-patched PG 11:

create table pp (a int, b text, unique (a, b), c varchar(64)) partition by
list (a);
create table pp1 partition of pp for values in (1);
create table pp2 partition of pp for values in (2);
alter table pp add constraint c_chk check (c <> '');
comment on constraint c_chk ON pp is 'parent check constraint';
comment on constraint c_chk ON pp1 is 'child check constraint 1';
comment on constraint c_chk ON pp2 is 'child check constraint 2';
select conname, obj_description(oid, 'pg_constraint') from pg_constraint
where conname = 'c_chk';
conname │ obj_description
─────────┼──────────────────────────
c_chk │ parent check constraint
c_chk │ child check constraint 1
c_chk │ child check constraint 2
(3 rows)

alter table pp alter c type varchar(64);

select conname, obj_description(oid, 'pg_constraint') from pg_constraint
where conname = 'c_chk';
conname │ obj_description
─────────┼─────────────────────────
c_chk │ parent check constraint
c_chk │
c_chk │
(3 rows)

The patch fixes that with some surgery of RebuildConstraintComment
combined with aforementioned changes. With the patch:

alter table pp alter c type varchar(64);

select conname, obj_description(oid, 'pg_constraint') from pg_constraint
where conname = 'c_chk';
conname │ obj_description
─────────┼──────────────────────────
c_chk │ parent check constraint
c_chk │ child check constraint 1
c_chk │ child check constraint 2
(3 rows)

alter table pp alter c type varchar(128);

select conname, obj_description(oid, 'pg_constraint') from pg_constraint
where conname = 'c_chk';
conname │ obj_description
─────────┼──────────────────────────
c_chk │ parent check constraint
c_chk │ child check constraint 1
c_chk │ child check constraint 2
(3 rows)

Also for index comments, but only in the case when indexes are not rebuilt.

comment on index pp_a_b_key is 'parent index';
comment on index pp1_a_b_key is 'child index 1';
comment on index pp2_a_b_key is 'child index 2';

select relname, relfilenode, obj_description(oid,'pg_class') from pg_class
where relname like 'pp%key';
relname │ relfilenode │ obj_description
─────────────┼─────────────┼─────────────────
pp1_a_b_key │ 17280 │ child index 1
pp2_a_b_key │ 17284 │ child index 2
pp_a_b_key │ 17271 │ parent index
(3 rows)

-- no rewrite, indexes untouched, comments preserved
alter table pp alter b type varchar(128);

select relname, relfilenode, obj_description(oid,'pg_class') from pg_class
where relname like 'pp%key';
relname │ relfilenode │ obj_description
─────────────┼─────────────┼─────────────────
pp1_a_b_key │ 17280 │ child index 1
pp2_a_b_key │ 17284 │ child index 2
pp_a_b_key │ 17271 │ parent index
(3 rows)

-- table rewritten, indexes rebuild, child indexes' comments gone
alter table pp alter b type varchar(64);

select relname, relfilenode, obj_description(oid,'pg_class') from pg_class
where relname like 'pp%key';
relname │ relfilenode │ obj_description
─────────────┼─────────────┼─────────────────
pp1_a_b_key │ 17294 │
pp2_a_b_key │ 17298 │
pp_a_b_key │ 17285 │ parent index
(3 rows)

I've also added tests for both the originally reported bug and the comment
ones.

The patch applies to PG 11.

Thanks,
Amit

Attachment Content-Type Size
ALTER-TYPE-preserve-old-index-comment-1.patch text/plain 20.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2019-04-25 11:33:38 BUG #15779: Partition elimination doesn't work as expected when using PARTITION BY RANGE
Previous Message Pavel Stehule 2019-04-25 09:43:01 Re: bug: evil autoConcat when each string is on new line

Browse pgsql-hackers by date

  From Date Subject
Next Message Shaoqi Bai 2019-04-25 10:37:04 Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Previous Message Kyotaro HORIGUCHI 2019-04-25 09:50:28 Comment fix for renamed functions