Re: Optimization for updating foreign tables in Postgres FDW

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2014-07-24 09:30:28
Message-ID: CAEZqfEdCznk8yzzxbb8kWw26PZDnFN=_c=UMQv2wx7gMERfA1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Fujita-san,

My coworker Takaaki EITOKU reviewed the patch, and here are some
comments from him.

2014-07-08 16:07 GMT+09:00 Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>:
...
> In the patch postgresPlanForeignModify has been modified so that if, in
> addition to the above condition, the followings are satisfied, then the
> ForeignScan and ModifyTable node will work that way.
>
> - There are no local BEFORE/AFTER triggers.

The reason the check ignores INSTEAD OF triggers here is that INSTEAD
OF trigger would prevent executing UPDATE/DELETE statement against a
foreign tables at all, right?

> - In UPDATE it's safe to evaluate expressions to assign to the target
> columns on the remote server.

IIUC, in addition to target expressions, whole WHERE clause should be
safe to be pushed-down. Though that is obviously required, but
mentioning that in documents and source comments would help users and
developers.

>
> Here is a simple performance test.
>
> On remote side:
> postgres=# create table t (id serial primary key, inserted timestamp
> default clock_timestamp(), data text);
> CREATE TABLE
> postgres=# insert into t(data) select random() from generate_series(0,
> 99999);
> INSERT 0 100000
> postgres=# vacuum t;
> VACUUM
>
> On local side:
> postgres=# create foreign table ft (id integer, inserted timestamp, data
> text) server myserver options (table_name 't');
> CREATE FOREIGN TABLE
>
> Unpatched:
> postgres=# explain analyze verbose delete from ft where id < 10000;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------
> Delete on public.ft (cost=100.00..162.32 rows=910 width=6) (actual
> time=1275.255..1275.255 rows=0 loops=1)
> Remote SQL: DELETE FROM public.t WHERE ctid = $1
> -> Foreign Scan on public.ft (cost=100.00..162.32 rows=910 width=6)
> (actual time=1.180..52.095 rows=9999 loops=1)
> Output: ctid
> Remote SQL: SELECT ctid FROM public.t WHERE ((id < 10000)) FOR
> UPDATE
> Planning time: 0.112 ms
> Execution time: 1275.733 ms
> (7 rows)
>
> Patched (Note that the DELETE command has been pushed down.):
> postgres=# explain analyze verbose delete from ft where id < 10000;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
> Delete on public.ft (cost=100.00..162.32 rows=910 width=6) (actual
> time=0.006..0.006 rows=0 loops=1)
> -> Foreign Scan on public.ft (cost=100.00..162.32 rows=910 width=6)
> (actual time=0.001..0.001 rows=0 loops=1)
> Output: ctid
> Remote SQL: DELETE FROM public.t WHERE ((id < 10000))
> Planning time: 0.101 ms
> Execution time: 8.808 ms
> (6 rows)

We found that this patch speeds up DELETE case remarkably, as you
describe above, but we saw only less than 2x speed on UPDATE cases.
Do you have any numbers of UPDATE cases?

Some more random thoughts:

* Naming of new behavior
You named this optimization "Direct Update", but I'm not sure that
this is intuitive enough to express this behavior. I would like to
hear opinions of native speakers.

* Macros for # of fdw_private elements
In postgres_fdw.c you defined MaxFdwScanFdwPrivateLength and
MinFdwScanFdwPrivateLength for determining the mode, but number of
fdw_private elements is not a ranged value but an enum value (I mean
that fdw_private holds 3 or 7, not 4~6, so Max and Min seems
inappropriate for prefix. IMO those macros should be named so that
the names represent the behavior, or define a macro to determine the
mode, say IS_SHIPPABLE(foo).

* Comparison of Macros
Comparison against MaxFdwScanFdwPrivateLength and
MinFdwScanFdwPrivateLength should be == instead of >= or <= to detect
unexpected value. Adding assert macro seems good too.

--
Shigeru HANADA

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2014-07-24 09:38:15 Re: Checkpointer crashes on slave in 9.4 on windows
Previous Message Asif Naeem 2014-07-24 07:59:17 Re: [BUGS] BUG #9652: inet types don't support min/max