bug with expression index on partition

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: bug with expression index on partition
Date: 2018-06-21 06:35:50
Message-ID: dce1fda4-e0f0-94c9-6abb-f5956a98c057@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.

I noticed a bug with creating an expression index on a partitioned table.
The fact that a partition may have differing attribute numbers is not
accounted for when recursively *creating* the index on partition. The
partition index gets created but is unusable.

create table p (a int) partition by hash (a);
create table p1 partition of p for values with (modulus 3, remainder 0);
create table p2 partition of p for values with (modulus 3, remainder 1);
create table p3 partition of p for values with (modulus 3, remainder 2);

-- make p3 have different attribute number for column a
alter table p detach partition p3;
alter table p3 drop a, add a int;
alter table p attach partition p3 for values with (modulus 3, remainder 2);

create index on p ((a+1));
\d p3
ERROR: invalid attnum 1 for relation "p3"

That's because the IndexStmt that's passed for creating the index on p3
contains expressions whose Vars are exact copies of the parent's Vars,
which in this case is wrong. We must have converted them to p3's
attribute numbers before calling DefineIndex on p3.

Note that if the same index already exists in a partition before creating
the parent's index, then that index already contains the right Vars, no
conversion is needed in that case.

alter table p detach partition p3;
alter table p3 drop a, add a int;
create index on p3 ((a+1));
alter table p attach partition p3 for values with (modulus 3, remainder 2);
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

create index on p ((a+1));
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

In other words, the code that checks whether an index with matching
definition exists in the partition correctly converts Vars before the
actual matching, so is able to conclude that, despite different Vars in
p3's index, it is same index as the one being created in the parent. So,
the index is not created again, which actually if it were, it would be hit
by the very same bug I'm reporting.

Also is right the case where the partition being attached doesn't have the
index but the parent has and it is correctly *cloned* to the attached
partition.

alter table p detach partition p3;
alter table p3 drop a, add a int;
create index on p ((a+1));
alter table p attach partition p3 for values with (modulus 3, remainder 2);
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

So, CompareIndexInfo and generateClonedIndexStmt are both doing the right
thing, but DefineIndex is not. Attached is a patch to fix DefineIndex so
that it converts indexParams before recursing to create the index on a
partition.

By the way, do we want to do anything about the fact that we don't check
if a matching inherited index exists when creating an index directly on
the partition. I wondered this because we do check the other way around.

\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))

create index on p3 ((a+1));
\d p3
Table "public.p3"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Partition of: p FOR VALUES WITH (modulus 3, remainder 2)
Indexes:
"p3_expr_idx" btree ((a + 1))
"p3_expr_idx1" btree ((a + 1))

Thanks,
Amit

Attachment Content-Type Size
v1-0001-Convert-indexParams-to-partition-s-attnos-before-.patch text/plain 5.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2018-06-21 07:12:17 Re: libpq compression
Previous Message Michael Paquier 2018-06-21 06:32:01 Re: Fix some error handling for read() and errno