Re: Adding support for Default partition in partitioning

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
Cc: Beena Emerson <memissemerson(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-06-12 04:09:25
Message-ID: CAFjFpRfvrgZxT8ycSYGNb23nNgiSh6+JT+D57F-sQ1kXey7h5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While the refactoring seems a reasonable way to re-use existing code,
that may change based on the discussion in [1]. Till then please keep
the refactoring patches separate from the main patch. In the final
version, I think the refactoring changes to ATAttachPartition and the
default partition support should be committed separately. So, please
provide three different patches. That also makes review easy.

On Mon, Jun 12, 2017 at 8:29 AM, Jeevan Ladhe
<jeevan(dot)ladhe(at)enterprisedb(dot)com> wrote:
> Hi Ashutosh,
>
> I tried to look into your refactoring code.
> When applied all 3 patches, I got some regression failures, I have fixed all
> of
> them now in attached patches, attached the regression.diffs.
>
> Moving further, I have also made following changes in attached patches:
>
> 1. 0001-Refactor-ATExecAttachPartition.patch
>
> + * There is a case in which we cannot rely on just the result of the
> + * proof
> This comment seems to also exist in current code, and I am not able to
> follow
> which case this refers to. But, IIUC, this comment is for the case where we
> are
> handling the 'key IS NOT NULL' part separately, and if that is the case it
> is
> not needed here in the prologue of the function.
>
> attachPartCanSkipValidation
> +static bool
> +ATCheckValidationSkippable(Relation scanRel, List *partConstraint,
> + PartitionKey key)
> The function name ATCheckValidationSkippable does not sound very intuitive
> to me,
> and also I think prefix AT is something does not fit here as the function is
> not
> really directly related to alter table command, instead is an auxiliary
> function.
> How about changing it to "attachPartitionRequiresScan" or
> "canSkipPartConstraintValidation"
>
> + List *existConstraint = NIL;
> Needs to be moved to inside if block instead.
>
> + bool skip_validate;
> Needs to be initialized to false, otherwise it can be returned without
> initialization when scanRel_constr is NULL.
>
> + if (scanRel_constr != NULL)
> instead of this may be we can simply have:
> + if (scanRel_constr == NULL)
> + return false;
> This can prevent further indentation.
>
> +static void
> +ATValidatePartitionConstraints(List **wqueue, Relation scanRel,
> + List *partConstraint, Relation rel)
> What about just validatePartitionConstraints()
>
> + bool skip_validate = false;
> +
> + /* Check if we can do away with having to scan the table being attached.
> */
> + skip_validate = ATCheckValidationSkippable(scanRel, partConstraint,
> key);
>
> First assignment is unnecessary here.
>
> Instead of:
> /* Check if we can do away with having to scan the table being attached. */
> skip_validate = ATCheckValidationSkippable(scanRel, partConstraint, key);
>
> /* It's safe to skip the validation scan after all */
> if (skip_validate)
> ereport(INFO,
> (errmsg("partition constraint for table \"%s\" is implied by existing
> constraints",
> RelationGetRelationName(scanRel))));
>
> Following change can prevent further indentation:
> if (ATCheckValidationSkippable(scanRel, partConstraint, key))
> {
> ereport(INFO,
> (errmsg("partition constraint for table \"%s\" is implied by existing
> constraints",
> RelationGetRelationName(scanRel))));
> return;
> }
> This way variable skip_validate will not be needed.
>
> Apart from this, I see that the patch will need change depending on how the
> fix
> for validating partition constraints in case of embedded NOT-NULL[1] shapes
> up.
>
> 2. 0003-Refactor-default-partitioning-patch-to-re-used-code.patch
>
> + * In case the new partition bound being checked itself is a DEFAULT
> + * bound, this check shouldn't be triggered as there won't already exists
> + * the default partition in such a case.
> I think above comment in DefineRelation() is not applicable, as
> check_default_allows_bound() is called unconditional, and the check for
> existence
> of default partition is now done inside the check_default_allows_bound()
> function.
>
> * This function checks if there exists a row in the default partition that
> * fits in the new partition and throws an error if it finds one.
> */
> Above comment for check_default_allows_bound() needs a change now, may be
> something like this:
> * This function checks if a default partition already exists and if it
> does
> * it checks if there exists a row in the default partition that fits in
> the
> * new partition and throws an error if it finds one.
> */
>
> List *new_part_constraints = NIL;
> List *def_part_constraints = NIL;
> I think above initialization is not needed, as the further assignments are
> unconditional.
>
> + if (OidIsValid(default_oid))
> + {
> + Relation default_rel = heap_open(default_oid, AccessExclusiveLock);
> We already have taken a lock on default and here we should be using a NoLock
> instead.
>
> + def_part_constraints =
> get_default_part_validation_constraint(new_part_constraints);
> exceeds 80 columns.
>
> + defPartConstraint =
> get_default_part_validation_constraint(partBoundConstraint);
> similarly, needs indentation.
>
> +
> +List *
> +get_default_part_validation_constraint(List *new_part_constraints)
> +{
> Needs some comment. What about:
> /*
> * get_default_part_validation_constraint
> *
> * Given partition constraints, this function returns *would be* default
> * partition constraint.
> */
>
> Apart from this, I tried to address the differences in error shown in case
> of
> attache and create partition when rows in default partition would violate
> the
> updated constraints, basically I have taken a flag in AlteredTableInfo to
> indicate if the relation being scanned is a default partition or a child of
> default partition(which I dint like much, but I don't see a way out here).
> Still
> the error message does not display the default partition name in error as of
> check_default_allows_bound(). May be to address this and keep the messages
> exactly similar we can copy the name of parent default partition in a field
> in
> AlteredTableInfo structure, which looks very ugly to me. I am open to
> suggestions here.
>
> 3. changes to default_partition_v19.patch:
>
> The default partition constraint are no more built using the negator of the
> operator, instead it is formed simply as NOT of the existing partitions:
> e.g.:
> if a null accepting partition already exists:
> NOT ((keycol IS NULL) OR (keycol = ANY (arr)))
> if a null accepting partition does not exists:
> NOT ((keycol IS NOT NULL) AND (keycol = ANY (arr))), where arr is an array
> of
> datums in boundinfo->datums.
>
> Added tests for prepared statment.
>
> Renamed RelationGetDefaultPartitionOid() to get_default_partition_oid().
>
> + if (partqualstate && ExecCheck(partqualstate, econtext))
> + ereport(ERROR,
> + (errcode(ERRCODE_CHECK_VIOLATION),
> + errmsg("new partition constraint for default partition \"%s\" would be
> violated by some row",
> + RelationGetRelationName(default_rel))));
> Per Ashutosh's suggestion[2], changed the error code to
> ERRCODE_INVALID_OBJECT_DEFINITION.
> Also, per Robert's suggestion[3], changed following message:
> "new partition constraint for default partition \"%s\" would be violated by
> some row"
> to
> "updated partition constraint for default partition \"%s\" would be violated
> by some row"
>
> Some other cosmetic changes.
>
> Apart from this, I am exploring the tests in relation with NOT NULL
> constraint
> embedded within an expression. Will update on that shortly.
>
> [1]http://www.postgresql-archive.org/A-bug-in-mapping-attributes-in-ATExecAttachPartition-td5965298.html
> [2]http://www.postgresql-archive.org/Adding-support-for-Default-partition-in-partitioning-td5946868i120.html#a5965277
> [3]http://www.postgresql-archive.org/Adding-support-for-Default-partition-in-partitioning-tp5946868p5965599.html
>
> Regards,
> Jeevan Ladhe
>
>
> On Thu, Jun 8, 2017 at 2:54 PM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>
>> On Wed, Jun 7, 2017 at 2:08 AM, Jeevan Ladhe
>> <jeevan(dot)ladhe(at)enterprisedb(dot)com> wrote:
>>
>> >
>> >>
>> >> The code in check_default_allows_bound() to check whether the default
>> >> partition
>> >> has any rows that would fit new partition looks quite similar to the
>> >> code
>> >> in
>> >> ATExecAttachPartition() checking whether all rows in the table being
>> >> attached
>> >> as a partition fit the partition bounds. One thing that
>> >> check_default_allows_bound() misses is, if there's already a constraint
>> >> on
>> >> the
>> >> default partition refutes the partition constraint on the new
>> >> partition,
>> >> we can
>> >> skip the scan of the default partition since it can not have rows that
>> >> would
>> >> fit the new partition. ATExecAttachPartition() has code to deal with a
>> >> similar
>> >> case i.e. the table being attached has a constraint which implies the
>> >> partition
>> >> constraint. There may be more cases which check_default_allows_bound()
>> >> does not
>> >> handle but ATExecAttachPartition() handles. So, I am wondering whether
>> >> it's
>> >> better to somehow take out the common code into a function and use it.
>> >> We
>> >> will
>> >> have to deal with a difference through. The first one would throw an
>> >> error
>> >> when
>> >> finding a row that satisfies partition constraints whereas the second
>> >> one
>> >> would
>> >> throw an error when it doesn't find such a row. But this difference can
>> >> be
>> >> handled through a flag or by negating the constraint. This would also
>> >> take
>> >> care
>> >> of Amit Langote's complaint about foreign partitions. There's also
>> >> another
>> >> difference that the ATExecAttachPartition() queues the table for scan
>> >> and
>> >> the
>> >> actual scan takes place in ATRewriteTable(), but there is not such
>> >> queue
>> >> while
>> >> creating a table as a partition. But we should check if we can reuse
>> >> the
>> >> code to
>> >> scan the heap for checking a constraint.
>> >>
>> >> In case of ATTACH PARTITION, probably we should schedule scan of
>> >> default
>> >> partition in the alter table's work queue like what
>> >> ATExecAttachPartition() is
>> >> doing for the table being attached. That would fit in the way alter
>> >> table
>> >> works.
>> >
>>
>> I tried refactoring existing code so that it can be used for default
>> partitioning as well. The code to validate the partition constraints
>> against the table being attached in ATExecAttachPartition() is
>> extracted out into a set of functions. For default partition we reuse
>> those functions to check whether it contains any row that would fit
>> the partition being attached. While creating a new partition, the
>> function to skip validation is reused but the scan portion is
>> duplicated from ATRewriteTable since we are not in ALTER TABLE
>> context. The names of the functions, their declaration will require
>> some thought.
>>
>> There's one test failing because for ATTACH partition the error comes
>> from ATRewriteTable instead of check_default_allows_bounds(). May be
>> we want to use same message in both places or some make ATRewriteTable
>> give a different message while validating default partition.
>>
>> Please review the patch and let me know if the changes look good.
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vinayak 2017-06-12 04:09:39 Re: ECPG: WHENEVER statement with DO CONTINUE action
Previous Message Thomas Munro 2017-06-12 04:04:23 Re: RTE_NAMEDTUPLESTORE, enrtuples and comments