Re: Declarative partitioning - another take

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Dmitry Ivanov <d(dot)ivanov(at)postgrespro(dot)ru>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Erik Rijkers <er(at)xs4all(dot)nl>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Declarative partitioning - another take
Date: 2017-01-20 02:58:25
Message-ID: 9012a140-f40a-6f6b-6525-8270e57bd399@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/01/20 4:18, Robert Haas wrote:
> On Thu, Jan 19, 2017 at 12:15 AM, Amit Langote wrote:
>> 0002-Set-ecxt_scantuple-correctly-for-tuple-routing.patch
>>
>> In 2ac3ef7a01df859c62d0a02333b646d65eaec5ff, we changed things so that
>> it's possible for a different TupleTableSlot to be used for partitioned
>> tables at successively lower levels. If we do end up changing the slot
>> from the original, we must update ecxt_scantuple to point to the new one
>> for partition key of the tuple to be computed correctly.
>>
>> Last posted here:
>> https://www.postgresql.org/message-id/99fbfad6-b89b-9427-a6ca-197aad98c48e%40lab.ntt.co.jp
>
> Why does FormPartitionKeyDatum need this? Could that be documented in
> a comment in here someplace, perhaps the header comment to
> FormPartitionKeyDatum?

FormPartitionKeyDatum() computes partition key expressions (if any) and
the expression evaluation machinery expects ecxt_scantuple to point to the
tuple to extract attribute values from.

FormPartitionKeyDatum() already has a tiny comment, which it seems is the
only thing we could say here about this there:

* the ecxt_scantuple slot of estate's per-tuple expr context must point to
* the heap tuple passed in.

In get_partition_for_tuple() which calls FormPartitionKeyDatum(), the
patch adds this comment (changed it a little from the last version):

+ /*
+ * Extract partition key from tuple. Expression evaluation machinery
+ * that FormPartitionKeyDatum() invokes expects ecxt_scantuple to
+ * point to the correct tuple slot. The slot might have changed from
+ * what was used for the parent table if the table of the current
+ * partitioning level has different tuple descriptor from the parent.
+ * So update ecxt_scantuple accordingly.
+ */
+ ecxt->ecxt_scantuple = slot;
FormPartitionKeyDatum(parent, slot, estate, values, isnull);

It says why we need to change which slot ecxt_scantuple points to.

>> 0004-Fix-some-issues-with-views-and-partitioned-tables.patch
>>
>> Automatically updatable views failed to handle partitioned tables.
>> Once that's fixed, WITH CHECK OPTIONS wouldn't work correctly without
>> the WCO expressions having been suitably converted for each partition
>> (think applying map_partition_varattnos to Vars in the WCO expressions
>> just like with partition constraint expressions).
>
> The changes to execMain.c contain a hunk which has essentially the
> same code twice. That looks like a mistake. Also, the patch doesn't
> compile because convert_tuples_by_name() takes 3 arguments, not 4.

Actually, I realized that and sent the updated version [1] of this patch
that fixed this issue. In the updated version, I removed that code block
(the 2 copies of it), because we are still discussing what to do about
showing tuples in constraint violation (in this case, WITH CHECK OPTION
violation) messages. Anyway, attached here again.

>> 0006-Avoid-tuple-coversion-in-common-partitioning-cases.patch
>>
>> Currently, the tuple conversion is performed after a tuple is routed,
>> even if the attributes of a target leaf partition map one-to-one with
>> those of the root table, which is wasteful. Avoid that by making
>> convert_tuples_by_name() return a NULL map for such cases.
>
> + Assert(!consider_typeid && indesc->tdhasoid == outdesc->tdhasoid);
>
> I think you mean Assert(consider_typeid || indesc->tdhasoid ==
> outdesc->tdhasoid);

Ah, you're right.

> But I wonder why we don't instead just change this function to
> consider tdhasoid rather than tdtypeid. I mean, if the only point of
> comparing the type OIDs is to find out whether the table-has-OIDs
> setting matches, we could instead test that directly and avoid needing
> to pass an extra argument. I wonder if there's some other reason this
> code is there which is not documented in the comment...

With the following patch, regression tests run fine:

if (indesc->natts == outdesc->natts &&
- indesc->tdtypeid == outdesc->tdtypeid)
+ indesc->tdhasoid != outdesc->tdhasoid)
{

If checking tdtypeid (instead of tdhasoid directly) has some other
consideration, I'd would have seen at least some tests broken by this
change. So, if we are to go with this, I too prefer it over my previous
proposal to add an argument to convert_tuples_by_name(). Attached 0003
implements the above approach.

> Phew. Thanks for all the patches, sorry I'm having trouble keeping up.

Thanks a lot for taking time to look through them and committing.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/92fe2a71-5eb7-ee8d-53ef-cfd5a65dfc3d%40lab.ntt.co.jp

Attachment Content-Type Size
0001-Set-ecxt_scantuple-correctly-for-tuple-routing.patch text/x-diff 4.7 KB
0002-Fix-some-issues-with-views-and-partitioned-tables.patch text/x-diff 5.6 KB
0003-Avoid-tuple-coversion-in-common-partitioning-cases.patch text/x-diff 2.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-01-20 03:14:40 SearchSysCache, SysCacheGetAttr, and heap_getattr()
Previous Message Tom Lane 2017-01-20 02:39:11 Re: Possible issue with expanded object infrastructure on Postgres 9.6.1