Re: Optimization for updating foreign tables in Postgres FDW

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2016-01-06 09:58:29
Message-ID: CAGPqQf0GLpwFnCGKrxzJ3OKZdUqF+T9qi_-XHyN2uy2Htou97w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I started looking at updated patch and its definitely iked the new
approach.

Patch passing the mandatory checks:

1) Patch applied cleanly using patch -p1 command
2) Source got compiled cleanly
3) make check, running cleanly

Explain output for the remote table and inheritance relations looks better
then earlier version of patch.

-- Inheritance foreign relation
postgres=# explain (ANALYZE, VERBOSE) update fp set a = 20;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------
Update on public.fp (cost=100.00..767.60 rows=10920 width=6) (actual
time=1.000..1.000 rows=0 loops=1)
Foreign Update on public.fp
Foreign Update on public.fc1
Foreign Update on public.fc2
Foreign Update on public.fc3
-> Foreign Update on public.fp (cost=100.00..191.90 rows=2730 width=6)
(actual time=0.493..0.493 rows=0 loops=1)
Remote SQL: UPDATE public.p SET a = 20
-> Foreign Update on public.fc1 (cost=100.00..191.90 rows=2730
width=6) (actual time=0.177..0.177 rows=0 loops=1)
Remote SQL: UPDATE public.c1 SET a = 20
-> Foreign Update on public.fc2 (cost=100.00..191.90 rows=2730
width=6) (actual time=0.163..0.163 rows=0 loops=1)
Remote SQL: UPDATE public.c2 SET a = 20
-> Foreign Update on public.fc3 (cost=100.00..191.90 rows=2730
width=6) (actual time=0.158..0.158 rows=0 loops=1)
Remote SQL: UPDATE public.c3 SET a = 20
Planning time: 0.228 ms
Execution time: 1.359 ms
(15 rows)

-- Foreign table update
postgres=# explain (ANALYZE, VERBOSE) update ft1 set c8 = 'updated' where
c1 = '200';
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.ft1 (cost=100.00..116.13 rows=2 width=144) (actual
time=0.485..0.485 rows=0 loops=1)
-> Foreign Update on public.ft1 (cost=100.00..116.13 rows=2 width=144)
(actual time=0.483..0.483 rows=0 loops=1)
Remote SQL: UPDATE public.t1 SET c8 = 'updated '::character(10)
WHERE ((c1 = 200))
Planning time: 0.158 ms
Execution time: 0.786 ms
(5 rows)

-- Explain output for the returning clause:
postgres=# explain (ANALYZE, VERBOSE) update ft1 set c8 = 'updated' where
c1 = '200' returning c2;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.ft1 (cost=100.00..116.13 rows=2 width=144) (actual
time=0.516..0.516 rows=0 loops=1)
Output: c2
-> Foreign Update on public.ft1 (cost=100.00..116.13 rows=2 width=144)
(actual time=0.514..0.514 rows=0 loops=1)
Remote SQL: UPDATE public.t1 SET c8 = 'updated '::character(10)
WHERE ((c1 = 200)) RETURNING c2
Planning time: 0.172 ms
Execution time: 0.938 ms
(6 rows)

-- Explain output when returning clause is not pushdown safe:
postgres=# explain (ANALYZE, VERBOSE) update ft1 set c8 = 'updated' where
c1 = '200' returning local_func(20);
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.ft1 (cost=100.00..116.13 rows=2 width=144) (actual
time=0.364..0.364 rows=0 loops=1)
Output: local_func(20)
-> Foreign Update on public.ft1 (cost=100.00..116.13 rows=2 width=144)
(actual time=0.363..0.363 rows=0 loops=1)
Remote SQL: UPDATE public.t1 SET c8 = 'updated '::character(10)
WHERE ((c1 = 200))
Planning time: 0.142 ms
Execution time: 0.623 ms
(6 rows)

-- Explain output with PREPARE:
postgres=# explain (ANALYZE, VERBOSE) execute ftupdate(20);
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.ft1 (cost=100.00..116.13 rows=2 width=144) (actual
time=0.712..0.712 rows=0 loops=1)
-> Foreign Update on public.ft1 (cost=100.00..116.13 rows=2 width=144)
(actual time=0.711..0.711 rows=1 loops=1)
Remote SQL: UPDATE public.t1 SET c8 = 'updated '::character(10)
WHERE ((c1 = 20))
Execution time: 1.001 ms
(4 rows)

With the initial look and test overall things looking great, I am still
reviewing the code changes but here are few early doubts/questions:

.) What the need of following change ?

@@ -833,9 +833,6 @@ appendWhereClause(StringInfo buf,
int nestlevel;
ListCell *lc;

- if (params)
- *params = NIL; /* initialize result list to empty */
-
/* Set up context struct for recursion */
context.root = root;
context.foreignrel = baserel;
@@ -971,6 +968,63 @@ deparseUpdateSql(StringInfo buf, PlannerInfo *root,
}

.) When Tom Lane and Stephen Frost suggested getting the core code involved,
I thought that we can do the mandatory checks into core it self and making
completely out of dml_is_pushdown_safe(). Please correct me

.) Documentation for the new API is missing (fdw-callbacks).

Regards,

On Fri, Dec 25, 2015 at 3:30 PM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> On 2015/12/24 4:34, Robert Haas wrote:
>
>> On Wed, Dec 23, 2015 at 5:50 AM, Rushabh Lathia
>> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
>>
>>> +1.
>>>
>>> I like idea of separate FDW API for the DML Pushdown. Was thinking can't
>>> we
>>> can re-use the IterateForeignScan(ForeignScanState *node) rather then
>>> introducing IterateDMLPushdown(ForeignScanState *node) new API ?
>>>
>>
> Yeah, I think we need to ask ourselves what advantage we're getting
>> out of adding any new core APIs. Marking the scan as a pushed-down
>> update or delete has some benefit in terms of making the information
>> visible via EXPLAIN, but even that's a pretty thin benefit. The
>> iterate method seems to just complicate the core code without any
>> benefit at all. More generally, there is very, very little code in
>> this patch that accomplishes anything that could not be done just as
>> well with the existing methods. So why are we even doing these core
>> changes?
>>
>
> From the FDWs' point of view, ISTM that what FDWs have to do for
> IterateDMLPushdown is quite different from what FDWs have to do for
> IterateForeignScan; eg, IterateDMLPushdown must work in accordance with
> presence/absence of a RETURNING list. (In addition to that,
> IterateDMLPushdown has been designed so that it must make the scan tuple
> available to later RETURNING projection in nodeModifyTable.c.) So, I think
> that it's better to FDWs to add separate APIs for the DML pushdown, making
> the FDW code much simpler. So based on that idea, I added the postgres_fdw
> changes to the patch. Attached is an updated version of the patch, which
> is still WIP, but I'd be happy if I could get any feedback.
>
> Tom seemed to think that we could centralize some checks in the core
>> code, say, related to triggers, or to LIMIT. But there's nothing like
>> that in this patch, so I'm not really understanding the point.
>>
>
> For the trigger check, I added relation_has_row_level_triggers. I placed
> that function in postgres_fdw.c in the updated patch, but I think that by
> placing that function in the core, FDWs can share that function. As for
> the LIMIT, I'm not sure we can do something about that.
>
> I think the current design allows us to handle a pushed-down update on a
> join, ie, "UPDATE foo ... FROM bar ..." where both foo and bar are remote,
> which was Tom's concern, but I'll leave that for another patch. Also, I
> think the current design could also extend to push down INSERT .. RETURNING
> .., but I'd like to leave that for future work.
>
> I'll add this to the next CF.
>
> Best regards,
> Etsuro Fujita
>

Rushabh Lathia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shulgin, Oleksandr 2016-01-06 10:38:04 Re: Inconsistent error handling in START_REPLICATION command
Previous Message Amit Langote 2016-01-06 09:57:32 Re: Regression caused by recent change to initdb?