Broken defenses against dropping a partitioning column

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Manuel Rigger <rigger(dot)manuel(at)gmail(dot)com>
Subject: Broken defenses against dropping a partitioning column
Date: 2019-07-07 19:11:43
Message-ID: 31920.1562526703@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(Moved from pgsql-bugs thread at [1])

Consider

regression=# create domain d1 as int;
CREATE DOMAIN
regression=# create table t1 (f1 d1) partition by range(f1);
CREATE TABLE
regression=# alter table t1 drop column f1;
ERROR: cannot drop column named in partition key

So far so good, but that defense has more holes than a hunk of
Swiss cheese:

regression=# drop domain d1 cascade;
psql: NOTICE: drop cascades to column f1 of table t1
DROP DOMAIN

Of course, the table is now utterly broken, e.g.

regression=# \d t1
psql: ERROR: cache lookup failed for type 0

(More-likely variants of this include dropping an extension that
defines the type of a partitioning column, or dropping the schema
containing such a type.)

The fix I was speculating about in the pgsql-bugs thread was to add
explicit pg_depend entries making the table's partitioning columns
internally dependent on the whole table (or maybe the other way around;
haven't experimented). That fix has a couple of problems though:

1. In the example, "drop domain d1 cascade" would automatically
cascade to the whole partitioned table, including child partitions
of course. This might leave a user sad, if a few terabytes of
valuable data went away; though one could argue that they'd better
have paid more attention to what the cascade cascaded to.

2. It doesn't fix anything for pre-existing tables in pre-v12 branches.

I thought of a different possible approach, which is to move the
"cannot drop column named in partition key" error check from
ATExecDropColumn(), where it is now, to RemoveAttributeById().
That would be back-patchable, but the implication would be that
dropping anything that a partitioning column depends on would be
impossible, even with CASCADE; you'd have to manually drop the
partitioned table first. Good for data safety, but a horrible
violation of expectations, and likely of the SQL spec as well.
I'm not sure we could avoid order-of-traversal problems, either.

Ideally, perhaps, a DROP CASCADE like this would not cascade to
the whole table but only to the table's partitioned-ness property,
leaving you with a non-partitioned table with most of its data
intact. It would take a lot of work to make that happen though,
and it certainly wouldn't be back-patchable, and I'm not really
sure it's worth it.

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CA%2Bu7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg%40mail.gmail.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Steven Pousty 2019-07-07 19:26:11 Re: Switching PL/Python to Python 3 by default in PostgreSQL 12
Previous Message Peter Eisentraut 2019-07-07 19:05:26 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)