Re: [HACKERS] Add support for tuple routing to foreign partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Maksim Milyutin <milyutinma(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Add support for tuple routing to foreign partitions
Date: 2018-02-22 02:52:07
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2018/02/21 20:54, Etsuro Fujita wrote:
> (2018/02/02 19:33), Etsuro Fujita wrote:
>> (2018/01/25 23:33), Stephen Frost wrote:
>>> I'm afraid a good bit of this patch is now failing to apply. I don't
>>> have much else to say except to echo the performance concern that Amit
>>> Langote raised about expanding the inheritence tree twice.
>> To address that concern, I'm thinking to redesign the patch so that it
>> wouldn't expand the tree at planning time anymore. I don't have any
>> clear solution for that yet, but what I have in mind now is to add new
>> FDW APIs to the executor, instead, so that the FDW could 1) create stuff
>> such as a query for remote INSERT as PlanForeignModify and 2)
>> initialize/end the remote INSERT operation as BeginForeignModify and
>> EndForeignModify, somewhere in the executor.
> New FDW APIs I would like to propose for that are:

Thanks for updating the proposal.

> void
> BeginForeignRouting(ModifyTableState *mtstate,
>                     ResultRelInfo *resultRelInfo,
>                     int partition_index);
> Prepare for a tuple-routing operation on a foreign table.  This is called
> from ExecSetupPartitionTupleRouting and ExecInitPartitionInfo.

I wonder why partition_index needs to be made part of this API?

> TupleTableSlot *
> ExecForeignRouting(EState *estate,
>                    ResultRelInfo *resultRelInfo,
>                    TupleTableSlot *slot);
> Route one tuple to the foreign table.  This is called from ExecInsert.
> void
> EndForeignRouting(EState *estate,
>                   ResultRelInfo *resultRelInfo);
> End the operation and release resources.  This is called from
> ExecCleanupTupleRouting.
> Attached are WIP patches for that:
> Patch postgres-fdw-refactoring-WIP.patch: refactoring patch for
> postgres_fdw.c to reduce duplicate code.
> Patch foreign-routing-fdwapi-WIP.patch: main patch to add new FDW APIs,
> which is created on top of patch postgres-fdw-refactoring-WIP.patch and
> the lazy-initialization-of-partition-info patch [1].

Noticed a typo in the patch (s/parition/partition/g):

+ * Also let the FDW init itself if this parition is foreign.

+ * Also let the FDW init itself if this parition is foreign.

> By this change we don't need to expand the inheritance tree at planning
> time, so no need to worry about the performance concern.  Maybe I'm
> missing something, though.  Early feedback would be greatly appreciated.

Perhaps an independent concern, but one thing I noticed is that it does
not seem to play well with the direct modification (update push-down)
feature. Now because updates (at least local, let's say) support
re-routing, I thought we'd be able move rows across servers via the local
server, but with direct modification we'd never get the chance. However,
since update tuple routing is triggered by partition constraint failure,
which we don't enforce for foreign tables/partitions anyway, I'm not sure
if we need to do anything about that, and even if we did, whether it
concerns this proposal.

That said, I saw in the changes to ExecSetupPartitionTupleRouting() that
BeginForeignRouting() is called for a foreign partition even if direct
modification might already have been set up. If direct modification is
set up, then ExecForeignRouting() will never get called, because we'd
never call ExecUpdate() or ExecInsert().


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2018-02-22 03:25:17 Re: [bug fix] Produce a crash dump before main() on Windows
Previous Message Tomas Vondra 2018-02-22 02:37:44 Re: Hash Joins vs. Bloom Filters / take 2