Re: BUG #15873: Attaching a partition fails because it sees deleted columns

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: dwilches(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15873: Attaching a partition fails because it sees deleted columns
Date: 2019-06-26 04:32:37
Message-ID: CA+HiwqHApJRxNRv5N5Z70E7OcajLqxNwC3P=Z5qBdCp_CqW6cg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, Jun 26, 2019 at 6:50 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> > Bug reference: 15873
> > PostgreSQL version: 11.2
>
> > I'm trying to attach a partition to one of my partitioned tables and I'm
> > getting an error:
> > ERROR: 42703: attribute 4 of type my_table_000000_partition has been
> > dropped
>
> FWIW, I tried to reproduce this using the attached example. Works
> fine here. However, this report sounds quite a lot like some partitioning
> bugs that have been fixed in the past. Could you update to 11.4 and try
> again? If it still fails, then there's some aspect of your problem table
> that you have not told us about.

I was able to reproduce the error that OP is seeing and some more.

To reproduce, you'd need to rejigger defchild's attributes such that
it contains a dropped column in the attnum position where pparent
contains its partition key.

> drop table if exists pparent;
> create table pparent(f1 int, z int, f2 int, my_timestamp timestamptz, f3 int)
> partition by range (my_timestamp);

So, change this:

> create table defchild (f1 int, f2 int,
> bogus1 text, my_timestamp timestamptz, bogus2 text, f3 int);

to:

create table defchild (f1 int, f2 int, bogus1 text, bogus2 text,
my_timestamp timestamptz, f3 int);

> alter table defchild drop column bogus1, drop column bogus2;

Now, pparent contains my_timestamp (the partition key) in 4th
position, whereas defchild contains a dropped column in that position.

> alter table pparent attach partition defchild default;
>
> create table otherchild (like defchild including all);
>
> alter table pparent attach partition otherchild
> for values from ('2019-01-01') to ('2020-01-01');

Finally, you need to insert at least one row into the default
partition before attaching 'otherrel'.

insert into pparent values (1, 1, '2019-01-01', 1);

alter table pparent attach partition otherchild
for values from ('2019-01-01') to ('2020-01-01');
ERROR: attribute 4 of type defchild has been dropped

This error is thrown when defchild is scanned to check if it contains
any rows that better have not been there (because they belonged to the
partition being attached).

The bug seems to be that the partition constraint qual expression
that's used for the scan hasn't been translated to bear the default
partition's attnos. Attached patch fixes it.

When fixing this, I noticed another problem that's caused by a
different instance of the same bug. See the following example:

-- normal case
create table p (a int, b int) partition by list (b);
create table p_def partition of p default;
alter table p_def add check (b < 0);
create table p_1 partition of p for values in (1);
INFO: updated partition constraint for default partition "p_def" is
implied by existing constraints

-- buggy case
create table p (a int, b int) partition by list (b);
create table p_def (a int, c int, b int);
alter table p_def drop c;
alter table p attach partition p_def default;
alter table p_def add check (b < 0);
-- no INFO message
create table p_1 partition of p for values in (1);

The problem here is that an partition constraint qual expression
passed to predtest.c hasn't been translated to match p_def's attnos.

Attached fixed that too.

Then I noticed a bug that's worse than the above two, whereby
check_default_partition_contents() scans the default partition with
only a partial expression, preventing the scan to detect violating
rows. It uses only the first element of the list of expressions
returned by make_ands_implicit(), assuming that there'd be only one,
but the Assert(list_length(<default-partition-constr-expr-list>) == 1)
fails make check. See the following example:

create table rlp3 (b varchar, a int) partition by list (b varchar_ops);
create table rlp3_default partition of rlp3 default;
create table rlp3abcd partition of rlp3 for values in ('ab', 'cd');
create table rlp3efgh partition of rlp3 for values in ('ef', 'gh');
insert into rlp3_default values ('xy');

-- this should've thrown an error that rlp3_default contains 'xy'
which shouldn't
-- be there
create table rlp3nullxy partition of rlp3 for values in (null, 'xy');
insert into rlp3 values ('xy');

-- so now, there's 'xy' in two partitions!
select tableoid::regclass, * from rlp3;
tableoid │ b │ a
──────────────┼────┼───
rlp3nullxy │ xy │
rlp3_default │ xy │
(2 rows)

Attached fixes that too.

Thanks,
Amit

Attachment Content-Type Size
default-part-qual-bugs.patch application/octet-stream 8.3 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2019-06-26 04:57:52 BUG #15874: How to Install Plug-ins on Windows System
Previous Message Michael Paquier 2019-06-26 02:26:42 Re: BUG #15872: copy command does not skip special character