Re: Declarative partitioning - another take

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Dmitry Ivanov <d(dot)ivanov(at)postgrespro(dot)ru>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Erik Rijkers <er(at)xs4all(dot)nl>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Declarative partitioning - another take
Date: 2017-01-10 11:06:45
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2017/01/05 5:50, Robert Haas wrote:
> On Tue, Dec 27, 2016 at 3:59 AM, Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> Patches 0001 to 0006 unchanged.
> Committed 0001 earlier, as mentioned in a separate email. Committed
> 0002 and part of 0003.

Thanks! I realized however that the approach I used in 0002 of passing
the original slot to ExecConstraints() fails in certain situations. For
example, if a BR trigger changes the tuple, the original slot would not
receive those changes, so it will be wrong to use such a tuple anymore.
In attached 0001, I switched back to the approach of converting the
partition-tupdesc-based tuple back to the root partitioned table's format.
The converted tuple contains the changes by BR triggers, if any. Sorry
about some unnecessary work.

> But I'm skeptical that the as-patched-by-0003
> logic in generate_partition_qual() makes sense. You do this:
> result = list_concat(generate_partition_qual(parent),
> copyObject(rel->rd_partcheck));
> /* Mark Vars with correct attnos */
> result = map_partition_varattnos(result, rel, parent);
> But that has the effect of applying map_partition_varattnos to
> everything in rel->rd_partcheck in addition to applying it to
> everything returned by generate_partition_qual() on the parent, which
> doesn't seem right.

I've replaced this portion of the code with (as also mentioned below):

/* Quick copy */
if (rel->rd_partcheck != NIL)
return copyObject(rel->rd_partcheck);

Down below (for the case when the partition qual is not cached, we now do

my_qual = get_qual_from_partbound(rel, parent, bound);

/* Add the parent's quals to the list (if any) */
if (parent->rd_rel->relispartition)
result = list_concat(generate_partition_qual(parent), my_qual);
result = my_qual;

* Change Vars to have partition's attnos instead of the parent's.
* We do this after we concatenate the parent's quals, because
* we want every Var in it to bear this relation's attnos.
result = map_partition_varattnos(result, rel, parent);

Which is then cached wholly in rd_partcheck.

As for your concern whether it's correct to do so, consider that doing
generate_partition_qual() on the parent returns qual with Vars that bear
the parent's attnos (which is OK as far parent as partition is concerned).
To apply the qual to the current relation as partition, we must change
the Vars to have this relation's attnos.

> Also, don't we want to do map_partition_varattnos() just ONCE, rather
> than on every call to this function? I think maybe your concern is
> that the parent might be changed without a relcache flush on the
> child, but I don't quite see how that could happen. If the parent's
> tuple descriptor changes, surely the child's tuple descriptor would
> have to be altered at the same time...

Makes sense. I fixed so that we return copyObject(rel->rd_partcheck), if
it's non-NIL, instead of generating parent's qual and doing the mapping
again. For some reason, I thought we couldn't save the mapped version in
the relcache.

By the way, in addition to the previously mentioned bug of RETURNING, I
found that WITH CHECK OPTION didn't work correctly as well. In fact
automatically updatable views failed to consider partitioned tables at
all. Patch 0007 is addressed towards fixing that.


Attachment Content-Type Size
0001-Fix-reporting-of-violation-in-ExecConstraints-again.patch text/x-diff 9.9 KB
0002-Fix-a-bug-in-how-we-generate-partition-constraints.patch text/x-diff 10.7 KB
0003-Fix-a-bug-of-insertion-into-an-internal-partition.patch text/x-diff 6.6 KB
0004-Add-some-more-tests-for-tuple-routing.patch text/x-diff 5.2 KB
0005-Avoid-tuple-coversion-in-common-partitioning-cases.patch text/x-diff 6.0 KB
0006-Fix-RETURNING-to-work-correctly-after-tuple-routing.patch text/x-diff 9.8 KB
0007-Fix-some-issues-with-views-and-partitioned-tables.patch text/x-diff 7.1 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-01-10 11:22:23 Re: Radix tree for character conversion
Previous Message Craig Ringer 2017-01-10 10:35:07 Re: proposal: session server side variables