Re: Adding support for Default partition in partitioning

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Keith Fiske <keith(at)omniti(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-05-02 16:03:03
Message-ID: CAH2L28u1gnddgSTFG=_XR9KBZ4pvT-jaqsxhsQNwV48cys0ZaA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Please find attached updated patch with review comments by Robert and
Jeevan implemented.

The newly proposed syntax
CREATE TABLE .. PARTITION OF .. DEFAULT has got most votes on this thread.

If there is no more objection, I will go ahead and include that in the
patch.

Thank you,
Rahila Syed

On Mon, Apr 24, 2017 at 2:40 PM, Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:

> Hello,
>
> Thank you for reviewing.
>
> >But that's not a good idea for several reasons. For one thing, you
> >can also write FOR VALUES IN (DEFAULT, 5) or which isn't sensible.
> >For another thing, this kind of syntax won't generalize to range
> >partitioning, which we've talked about making this feature support.
> >Maybe something like:
>
> >CREATE TABLE .. PARTITION OF .. DEFAULT;
>
> I agree that the syntax should be changed to also support range
> partitioning.
>
> Following can also be considered as it specifies more clearly that the
> partition holds default values.
>
> CREATE TABLE ...PARTITION OF...FOR VALUES DEFAULT;
>
> >Maybe we should introduce a dedicated node type to
> >represent a default-specification in the parser grammar. If not, then
> >let's at least encapsulate the test a little better, e.g. by adding
> >isDefaultPartitionBound() which tests not only IsA(..., DefElem) but
> >also whether the name is DEFAULT as expected. BTW, we typically use
> >lower-case internally, so if we stick with this representation it
> >should really be "default" not "DEFAULT".
>
> isDefaultPartitionBound() function is created in the attached patch which
> checks for both node type and name.
>
> >Why abbreviate "default" to def here? Seems pointless.
> Corrected in the attached.
>
> >Consider &&
> Fixed.
>
> >+ * default partiton for rows satisfying the new partition
> >Spelling.
> Fixed.
>
> >Missing apostrophe
> Fixed.
>
> >Definitely not safe against concurrency, since AccessShareLock won't
> >exclude somebody else's update. In fact, it won't even cover somebody
> >else's already-in-flight transaction
> Changed it to AccessExclusiveLock
>
> >Normally in such cases we try to give more detail using
> >ExecBuildSlotValueDescription.
> This function is used in execMain.c and the error is being
> reported in partition.c.
> Do you mean the error reporting should be moved into execMain.c
> to use ExecBuildSlotValueDescription?
>
> >This variable starts out true and is never set to any value other than
> >true. Just get rid of it and, in the one place where it is currently
> >used, write "true". That's shorter and clearer.
> Fixed.
>
> >There's not really a reason to cast the result of stringToNode() to
> >Node * and then turn around and cast it to PartitionBoundSpec *. Just
> >cast it directly to whatever it needs to be. And use the new castNode
> >macro
> Fixed. castNode macro takes as input Node * whereas stringToNode() takes
> string.
> IIUC, castNode cant be used here.
>
> >The if (def_elem) test continues
> >early, but if the point is that the loop using cell3 shouldn't execute
> >in that case, why not just put if (!def_elem) { foreach(cell3, ...) {
> >... } } instead of reiterating the ReleaseSysCache in two places?
> Fixed in the attached.
>
> I will respond to further comments in following email.
>
>
> On Thu, Apr 13, 2017 at 12:48 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
>
>> On Thu, Apr 6, 2017 at 7:30 AM, Rahila Syed <rahilasyed90(at)gmail(dot)com>
>> wrote:
>> > Thanks a lot for testing and reporting this. Please find attached an
>> updated
>> > patch with the fix. The patch also contains a fix
>> > regarding operator used at the time of creating expression as default
>> > partition constraint. This was notified offlist by Amit Langote.
>>
>> I think that the syntax for this patch should probably be revised.
>> Right now the proposal is for:
>>
>> CREATE TABLE .. PARTITION OF ... FOR VALUES IN (DEFAULT);
>>
>> But that's not a good idea for several reasons. For one thing, you
>> can also write FOR VALUES IN (DEFAULT, 5) or which isn't sensible.
>> For another thing, this kind of syntax won't generalize to range
>> partitioning, which we've talked about making this feature support.
>> Maybe something like:
>>
>> CREATE TABLE .. PARTITION OF .. DEFAULT;
>>
>> This patch makes the assumption throughout that any DefElem represents
>> the word DEFAULT, which is true in the patch as written but doesn't
>> seem very future-proof. I think the "def" in "DefElem" stands for
>> "definition" or "define" or something like that, so this is actually
>> pretty confusing. Maybe we should introduce a dedicated node type to
>> represent a default-specification in the parser grammar. If not, then
>> let's at least encapsulate the test a little better, e.g. by adding
>> isDefaultPartitionBound() which tests not only IsA(..., DefElem) but
>> also whether the name is DEFAULT as expected. BTW, we typically use
>> lower-case internally, so if we stick with this representation it
>> should really be "default" not "DEFAULT".
>>
>> Useless hunk:
>>
>> + bool has_def; /* Is there a default partition?
>> Currently false
>> + * for a range partitioned table */
>> + int def_index; /* Index of the default list
>> partition. -1 for
>> + * range partitioned tables */
>>
>> Why abbreviate "default" to def here? Seems pointless.
>>
>> + if (found_def)
>> + {
>> + if (mapping[def_index] == -1)
>> + mapping[def_index] = next_index++;
>> + }
>>
>> Consider &&
>>
>> @@ -717,7 +754,6 @@ check_new_partition_bound(char *relname, Relation
>> parent, Node *bound)
>> }
>> }
>> }
>> -
>> break;
>> }
>>
>> + * default partiton for rows satisfying the new partition
>>
>> Spelling.
>>
>> + * constraint. If found dont allow addition of a new partition.
>>
>> Missing apostrophe.
>>
>> + defrel = heap_open(defid, AccessShareLock);
>> + tupdesc = CreateTupleDescCopy(RelationGetDescr(defrel));
>> +
>> + /* Build expression execution states for partition check quals */
>> + partqualstate = ExecPrepareCheck(partConstraint,
>> + estate);
>> +
>> + econtext = GetPerTupleExprContext(estate);
>> + snapshot = RegisterSnapshot(GetLatestSnapshot());
>>
>> Definitely not safe against concurrency, since AccessShareLock won't
>> exclude somebody else's update. In fact, it won't even cover somebody
>> else's already-in-flight transaction.
>>
>> + errmsg("new default partition constraint is violated
>> by some row")));
>>
>> Normally in such cases we try to give more detail using
>> ExecBuildSlotValueDescription.
>>
>> + bool is_def = true;
>>
>> This variable starts out true and is never set to any value other than
>> true. Just get rid of it and, in the one place where it is currently
>> used, write "true". That's shorter and clearer.
>>
>> + inhoids = find_inheritance_children(RelationGetRelid(parent),
>> NoLock);
>>
>> If it's actually safe to do this with no lock, there ought to be a
>> comment with a very compelling explanation of why it's safe.
>>
>> + boundspec = (Node *) stringToNode(TextDatumGetCString(datum));
>> + bspec = (PartitionBoundSpec *)boundspec;
>>
>> There's not really a reason to cast the result of stringToNode() to
>> Node * and then turn around and cast it to PartitionBoundSpec *. Just
>> cast it directly to whatever it needs to be. And use the new castNode
>> macro.
>>
>> + foreach(cell1, bspec->listdatums)
>> + {
>> + Node *value = lfirst(cell1);
>> + if (IsA(value, DefElem))
>> + {
>> + def_elem = true;
>> + *defid = inhrelid;
>> + }
>> + }
>> + if (def_elem)
>> + {
>> + ReleaseSysCache(tuple);
>> + continue;
>> + }
>> + foreach(cell3, bspec->listdatums)
>> + {
>> + Node *value = lfirst(cell3);
>> + boundspecs = lappend(boundspecs, value);
>> + }
>> + ReleaseSysCache(tuple);
>> + }
>> + foreach(cell4, spec->listdatums)
>> + {
>> + Node *value = lfirst(cell4);
>> + boundspecs = lappend(boundspecs, value);
>> + }
>>
>> cell1, cell2, cell3, and cell4 are not very clear variable names.
>> Between that and the lack of comments, this is not easy to understand.
>> It's sort of spaghetti logic, too. The if (def_elem) test continues
>> early, but if the point is that the loop using cell3 shouldn't execute
>> in that case, why not just put if (!def_elem) { foreach(cell3, ...) {
>> ... } } instead of reiterating the ReleaseSysCache in two places?
>>
>> + /* Collect bound spec nodes in a list. This is done
>> if the partition is
>> + * a default partition. In case of default partition,
>> constraint is formed
>> + * by performing <> operation over the partition
>> constraints of the
>> + * existing partitions.
>> + */
>>
>> I doubt that handles NULLs properly.
>>
>> + inhoids =
>> find_inheritance_children(RelationGetRelid(parent), NoLock);
>>
>> Again, no lock? Really?
>>
>> The logic which follows looks largely cut-and-pasted, which makes me
>> think you need to do some refactoring here to make it more clear
>> what's going on, so that you have the relevant logic in just one
>> place. It seems wrong anyway to shove all of this logic specific to
>> the default case into get_qual_from_partbound() when the logic for the
>> non-default case is inside get_qual_for_list. Where there were 2
>> lines of code before you've now got something like 30.
>>
>> + if(get_negator(operoid) == InvalidOid)
>> + elog(ERROR, "no negator found for partition operator %u",
>> + operoid);
>>
>> I really doubt that's OK. elog() shouldn't be reachable, but this
>> will be reachable if the partitioning operator does not have a
>> negator. And there's the NULL-handling issue I mentioned above, too.
>>
>> + if (partdesc->boundinfo->has_def && key->strategy
>> + == PARTITION_STRATEGY_LIST)
>> + result = parent->indexes[partdesc->boun
>> dinfo->def_index];
>>
>> Testing for PARTITION_STRATEGY_LIST here seems unnecessary. If
>> has_def (or has_default, as it probably should be) isn't allowed for
>> range partitions, then it's redundant; if it is allowed, then that
>> case should be handled too. Also, at this point we've already set
>> *failed_at and *failed_slot; presumably you'd want to make this check
>> before you get to that point.
>>
>> I suspect there are quite a few more problems here in addition to the
>> ones mentioned above, but I don't think it makes sense to spend too
>> much time searching for them until some of this basic stuff is cleaned
>> up.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>
>

Attachment Content-Type Size
default_partition_v8.patch application/x-download 20.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-05-02 16:06:18 Re: Potential hot-standby bug around xacts committed but in xl_running_xacts
Previous Message Thomas Kellerer 2017-05-02 16:01:14 Re: CTE inlining