Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.
Date: 2018-10-05 06:06:52
Lists: pgsql-hackers

Thanks for the report.

On 2018/10/04 3:58, Rajkumar Raghuwanshi wrote:
> Hi,
> I am getting ERROR: null relpartbound for relation 18159 while doing
> pg_upgrade from v11 to v11/master.
> -- user-defined operator class in partition key
> CREATE FUNCTION my_int4_sort(int4,int4) RETURNS int LANGUAGE sql
> AS $$ SELECT CASE WHEN $1 = $2 THEN 0 WHEN $1 > $2 THEN 1 ELSE -1 END; $$;
> CREATE OPERATOR CLASS test_int4_ops FOR TYPE int4 USING btree AS
> OPERATOR 1 < (int4,int4), OPERATOR 2 <= (int4,int4),
> OPERATOR 3 = (int4,int4), OPERATOR 4 >= (int4,int4),
> OPERATOR 5 > (int4,int4), FUNCTION 1 my_int4_sort(int4,int4);
> CREATE TABLE partkey_t (a int4) PARTITION BY RANGE (a test_int4_ops);
> CREATE TABLE partkey_t_1 PARTITION OF partkey_t FOR VALUES FROM (0) TO
> (1000);
> INSERT INTO partkey_t VALUES (100);
> INSERT INTO partkey_t VALUES (200);
> --ran pg_upgrade failed with below error.
> pg_restore: [archiver (db)] could not execute query: ERROR: null
> relpartbound for relation 18159
> CONTEXT: SQL function "my_int4_sort"

Interesting test case.

To reproduce, the following works too (after creating the objects as
described above):

alter table partkey_t detach partition partkey_t_1;
alter table partkey_t attach partition partkey_t_1 for values from (0) to
(1000);
ERROR: null relpartbound for relation 16396
CONTEXT: SQL function "my_int4_sort"

The stack at the time of the error:

(gdb) bt
#0 RelationBuildPartitionDesc
#1 0x00000000009bf04e in RelationBuildDesc
#2 0x00000000009c1784 in RelationClearRelation
#3 0x00000000009c1cc5 in RelationFlushRelation
#4 0x00000000009c1dd9 in RelationCacheInvalidateEntry
#5 0x00000000009b9496 in LocalExecuteInvalidationMessage
#6 0x00000000009b91ec in ProcessInvalidationMessages
#7 0x00000000009b9cdb in CommandEndInvalidationMessages
#8 0x00000000005346ef in AtCCI_LocalCache
#9 0x0000000000534124 in CommandCounterIncrement
#10 0x00000000006c579d in fmgr_sql
#11 0x00000000009de7c2 in FunctionCall2Coll
#12 0x000000000058ac9f in partition_rbound_cmp
#13 0x0000000000588059 in check_new_partition_bound
#14 0x000000000067f536 in ATExecAttachPartition

So, the CommandCounterIncrement done in fmgr_sql causes partkey_t's
PartitionDesc to be recomputed, which counts partkey_t_1 as its child
because ATExecAttachPartition has already finished CreateInheritance which
would've sent out an invalidation message for partkey_t.

As of commit 2fbdf1b38bc [1], which has been applied in 11 and HEAD
branches, RelationBuildPartitionDesc emits an error if we don't find
relpartbound set for a child found by scanning pg_inherits, instead of
skipping such children. While that commit switched the order of creating
the pg_inherits entry and checking a new bound against existing bounds in
DefineRelation in light of the aforementioned change, it didn't do the same
in ATExecAttachPartition, hence this error.
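
To make the ordering problem concrete, here is a toy model in C. It is only
a sketch of the sequence described above, not PostgreSQL code: ToyChild,
toy_build_partition_desc, toy_check_new_partition_bound and toy_attach are
hypothetical names, and the rebuild triggered by CommandCounterIncrement is
collapsed into a direct call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the catalog state of the table being attached.
 * These fields are illustrative, not PostgreSQL's real structures. */
typedef struct ToyChild {
    bool in_pg_inherits;    /* models the pg_inherits row existing */
    bool has_relpartbound;  /* models pg_class.relpartbound being set */
} ToyChild;

/* Models the descriptor rebuild after commit 2fbdf1b38bc: a child that is
 * visible through pg_inherits but has no bound is an error (false). */
bool toy_build_partition_desc(const ToyChild *child)
{
    if (child->in_pg_inherits && !child->has_relpartbound)
        return false;   /* "null relpartbound for relation ..." */
    return true;
}

/* Models the bound check with a SQL-language support function: running it
 * processes pending invalidations, which rebuilds the parent's descriptor. */
bool toy_check_new_partition_bound(const ToyChild *child)
{
    return toy_build_partition_desc(child);
}

/* Models the attach sequence in either order; returns false on error. */
bool toy_attach(bool inheritance_first)
{
    ToyChild child = { false, false };

    if (inheritance_first) {
        child.in_pg_inherits = true;            /* inheritance entry first */
        if (!toy_check_new_partition_bound(&child))
            return false;                       /* rebuild sees a null bound */
    } else {
        if (!toy_check_new_partition_bound(&child))
            return false;                       /* child not yet visible */
        child.in_pg_inherits = true;            /* inheritance entry second */
    }

    child.has_relpartbound = true;              /* bound stored last */
    assert(child.in_pg_inherits && child.has_relpartbound);
    return toy_build_partition_desc(&child);
}
```

In this model toy_attach(true) fails the way the reported case does, while
toy_attach(false), the order DefineRelation already uses, succeeds.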

Attached patch fixes that.

I thought we'd need to apply this to 10, 11, HEAD, but I couldn't
reproduce this in 10. That's because the above commit wasn't applied to
10, so the child that causes this error is being skipped in 10's case.
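
The difference between the branches can be sketched the same way. Again this
is a toy model under my own assumptions, not the real
RelationBuildPartitionDesc: ToyInhChild and toy_count_partitions are
hypothetical names, and the error_on_null_bound flag stands in for whether
the above commit has been applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one child found by scanning pg_inherits. */
typedef struct ToyInhChild {
    unsigned int oid;       /* illustrative OID only */
    bool has_relpartbound;  /* models relpartbound being non-null */
} ToyInhChild;

/* Returns the number of children that end up in the descriptor,
 * or -1 to model the "null relpartbound" error. */
int toy_count_partitions(const ToyInhChild *children, size_t n,
                         bool error_on_null_bound)
{
    int count = 0;

    assert(children != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!children[i].has_relpartbound) {
            if (error_on_null_bound)
                return -1;  /* behavior with the commit: raise an error */
            continue;       /* behavior without it: skip the child */
        }
        count++;
    }
    return count;
}
```

With one child that has a bound and one that does not, the model returns 1
when skipping (as in 10) and -1 when erroring (as in 11 and HEAD).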

Maybe we should apply the parts of the above commit that apply to 10 and
then this patch on top. The attached for-10.patch file does that.



Attachment                                                           Content-Type  Size
ATExecAttachPartition-create-inheritance-after-checking-bound.patch  text/plain    1.1 KB
for-10.patch                                                         text/plain    3.8 KB
