Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, exclusion(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition
Date: 2021-06-29 07:11:51
Message-ID: CA+HiwqF+keTbzQqYPR=QtzNon+huzCRcb676eb66NTqPYDARtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Jun 29, 2021 at 7:30 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > I think the patch has got more problems than that too, as it's far
> > from clear why the next remainder would have anything to do with the
> > next larger modulus. IOW I suspect that when it does manage to print
> > a partition name without crashing, it's very likely to print the
> > wrong partition.
>
> Concretely, this variant of the test case:
>
> drop table hash_parted;
> CREATE TABLE hash_parted (
> a int
> ) PARTITION BY HASH (a);
> CREATE TABLE hpart_1 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 50, REMAINDER 0);
> CREATE TABLE hpart_2 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 10, REMAINDER 1);
> CREATE TABLE hpart_3 PARTITION OF hash_parted FOR VALUES WITH (MODULUS 200, REMAINDER 2);
>
> CREATE TABLE fail_part PARTITION OF hash_parted FOR VALUES WITH (MODULUS 25, REMAINDER 3);
>
> prints
>
> ERROR: every hash partition modulus must be a factor of the next larger modulus
> DETAIL: The new modulus 25 is not divisible by 10, the modulus of existing partition "hpart_1".
>
> which is obviously wrong.

Yes. Though, I think the problem is that the code uses
boundinfo->indexes[<offset>] as an index into partdesc->oids[] array
to get the existing partition to show in the error message. The
correct thing here would be to use <offset> directly. That is because
the hash partition bounds as laid out in boudinfo->datums[] correspond
one-to-one with the partition OIDs laid out in partdesc->oids[].
boundinfo->indexes should only be consulted to get the partition OID
from a remainder value, as get_partition_for_tuple() does.

(OTOH, range and list partition bound offsets, because there may be
more than one bound assigned to each partition (considering
de-duplication of pairs of matching upper and next lower bound in the
range case), do not correspond one-to-one with those of partition
OIDs.)

> I think that rescuing this might require scanning through the list of
> partitions for one that has the desired modulus. There could be more
> than one match, but I'd be content to take the first one.

Hmm, the bound offset that the code comes up with to show in the error
message seems fine to me.

I have attached a patch containing the above mentioned fix. The patch
also adjusts the relevant test cases in create_table.sql to cover
these cases.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
hashpart-bound-error-wrong-partname-fix.patch application/octet-stream 5.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2021-06-29 18:36:52 Re: BUG #17076: Server crashes on composing an error message about invalid modulus for a new table partition
Previous Message Bruce Momjian 2021-06-29 01:02:41 Re: BUG #17073: docs - "Improve signal handling reliability"