Re: Optimization for updating foreign tables in Postgres FDW

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2016-01-20 10:57:52
Message-ID: CAGPqQf3fjterWL5phME_xAb_2RvMJ_MjjGeFLG_jhfdZf9MDdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 15, 2016 at 9:06 AM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> On 2016/01/14 21:36, Rushabh Lathia wrote:
>
>> On Thu, Jan 14, 2016 at 2:00 PM, Etsuro Fujita
>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>> wrote:
>>
>
> On 2016/01/12 20:31, Rushabh Lathia wrote:
>>
>
> On Thu, Jan 7, 2016 at 6:15 PM, Etsuro Fujita
>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp
>> <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
>> <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp
>>
>> <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>>> wrote:
>> On 2016/01/06 18:58, Rushabh Lathia wrote:
>> .) What the need of following change ?
>>
>> @@ -833,9 +833,6 @@ appendWhereClause(StringInfo buf,
>> int nestlevel;
>> ListCell *lc;
>>
>> - if (params)
>> - *params = NIL; /* initialize result
>> list to
>> empty */
>> -
>> /* Set up context struct for recursion */
>> context.root = root;
>> context.foreignrel = baserel;
>> @@ -971,6 +968,63 @@ deparseUpdateSql(StringInfo buf,
>> PlannerInfo *root,
>> }
>>
>
> It is needed for deparsePushedDownUpdateSql to store params
>> in both
>> WHERE clauses and expressions to assign to the target columns
>> into one params_list list.
>>
>
> Hmm sorry but I am still not getting the point, can you provide
>> some
>> example to explain this ?
>>
>
> Sorry, maybe my explanation was not enough. Consider:
>>
>> postgres=# create foreign table ft1 (a int, b int) server myserver
>> options (table_name 't1');
>> postgres=# insert into ft1 values (0, 0);
>> postgres=# prepare mt(int, int) as update ft1 set a = $1 where b = $2;
>> postgres=# explain verbose execute mt(1, 0);
>> postgres=# explain verbose execute mt(1, 0);
>> postgres=# explain verbose execute mt(1, 0);
>> postgres=# explain verbose execute mt(1, 0);
>> postgres=# explain verbose execute mt(1, 0);
>>
>> After the 5 executions of mt we have
>>
>> postgres=# explain verbose execute mt(1, 0);
>> QUERY PLAN
>>
>> ------------------------------------------------------------------------------------
>> Update on public.ft1 (cost=100.00..140.35 rows=12 width=10)
>> -> Foreign Update on public.ft1 (cost=100.00..140.35 rows=12
>> width=10)
>> Remote SQL: UPDATE public.t1 SET a = $1::integer WHERE ((b
>> = $2::integer))
>> (3 rows)
>>
>> If we do that initialization in appendWhereClause, we would get a
>> wrong params_list list and a wrong remote pushed-down query for the
>> last mt() in deparsePushedDownUpdateSql.
>>
>
> Strange, I am seeing same behaviour with or without that initialization in
>> appendWhereClause. After the 5 executions of mt I with or without I am
>> getting following output:
>>
>> postgres=# explain verbose execute mt(1, 0);
>> QUERY PLAN
>>
>> ------------------------------------------------------------------------------------
>> Update on public.ft2 (cost=100.00..140.35 rows=12 width=10)
>> -> Foreign Update on public.ft2 (cost=100.00..140.35 rows=12
>> width=10)
>> Remote SQL: UPDATE public.t2 SET a = $1::integer WHERE ((b =
>> $2::integer))
>> (3 rows)
>>
>
> Really? With that initialization in appendWhereClause, I got the
> following wrong result (note that both parameter numbers are $1):
>
> postgres=# explain verbose execute mt(1, 0);
> QUERY PLAN
>
> ------------------------------------------------------------------------------------
> Update on public.ft1 (cost=100.00..140.35 rows=12 width=10)
> -> Foreign Update on public.ft1 (cost=100.00..140.35 rows=12 width=10)
> Remote SQL: UPDATE public.t1 SET a = $1::integer WHERE ((b =
> $1::integer))
> (3 rows)
>
>
Oops sorry. I got the point now.

> BTW, I keep a ForeignScan node pushing down an update to the remote
>> server, in the updated patches. I have to admit that that seems
>> like rather a misnomer. So, it might be worth adding a new
>> ForeignUpdate node, but my concern about that is that if doing so,
>> we would have a lot of duplicate code in ForeignUpdate and
>> ForeignScan. What do you think about that?
>>
>
> Yes, I noticed that in the patch and I was about to point that out in my
>> final review. As first review I was mainly focused on the functionality
>> testing
>> and other overview things. Another reason I haven't posted that in my
>> first review round is, I was not quite sure whether we need the
>> separate new node ForeignUpdate, ForeignDelete and want to duplicate
>> code? Was also not quite sure about the fact that what we will achieve
>> by doing that.
>>
>> So I thought, I will have this open question in my final review comment,
>> and will take committer's opinion on this. Since you already raised this
>> question lets take others opinion on this.
>>
>
> OK, let's do that.
>
>
Overall I am quite done with the review of this patch. Patch is in good
shape and covered most of the things which been discussed earlier
or been mentioned during review process. Patch pass through the
make check and also includes good test coverage.

Here are couple of things which is still open for discussion:

1)

.) When Tom Lane and Stephen Frost suggested getting the core code involved,
>> I thought that we can do the mandatory checks into core it self and making
>> completely out of dml_is_pushdown_safe(). Please correct me
>>
>
> The reason why I put that function in postgres_fdw.c is Check point 4:
>
> + * 4. We can't push an UPDATE down, if any expressions to assign to the
> target
> + * columns are unsafe to evaluate on the remote server.
>
>
Here I was talking about checks related to triggers, or to LIMIT. I think
earlier thread talked about those mandatory check to the core. So may
be we can move those checks into make_modifytable() before calling
the PlanDMLPushdown.

This need to handle by the Owner.

2) Decision on whether we need the separate new node ForeignUpdate,
ForeignDelete. In my opinion I really don't see the need of this as we
that will add lot of duplicate. Having said that if committer or someone
else feel like that will make code more clean that is also true,

This need more comments from the committer.

Thanks

Rushabh Lathia
www.EnterpriseDB.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-01-20 11:54:26 Re: dynloader.h missing in prebuilt package for Windows?
Previous Message Andres Freund 2016-01-20 10:13:26 Re: checkpointer continuous flushing