Re: [HACKERS] Add support for tuple routing to foreign partitions

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Maksim Milyutin <milyutinma(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Add support for tuple routing to foreign partitions
Date: 2018-02-23 07:38:24
Message-ID: CA+HiwqH--NTNNz_UO733+X_kiiE1yBwwavj3DjKpFEBzHDeySA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fujita-san,

On Thu, Feb 22, 2018 at 8:49 PM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> (2018/02/22 11:52), Amit Langote wrote:
>> I wonder why partition_index needs to be made part of this API?
>
> The reason for that is because I think the FDW might want to look at the
> partition info stored in mtstate->mt_partition_tuple_routing for some reason
> or other, such as parent_child_tupconv_maps and child_parent_tupconv_maps,
> which are accessed with the given partition index.

I see. I guess that makes sense.

>> Perhaps an independent concern, but one thing I noticed is that it does
>> not seem to play well with the direct modification (update push-down)
>> feature. Now because updates (at least local, let's say) support
>> re-routing, I thought we'd be able move rows across servers via the local
>> server, but with direct modification we'd never get the chance. However,
>> since update tuple routing is triggered by partition constraint failure,
>> which we don't enforce for foreign tables/partitions anyway, I'm not sure
>> if we need to do anything about that, and even if we did, whether it
>> concerns this proposal.
>
> Good point! Actually, update tuple routing we have in HEAD doesn't allow
> re-routing tuples from foreign partitions even without direct modification.
> Here is an example using postgres_fdw:
>
> postgres=# create table pt (a int, b text) partition by list (a);
> CREATE TABLE
> postgres=# create table loct (a int check (a in (1)), b text);
> CREATE TABLE
> postgres=# create foreign table remp partition of pt for values in (1)
> server loopback options (table_name 'loct');
> CREATE FOREIGN TABLE
> postgres=# create table locp partition of pt for values in (2);
> CREATE TABLE
> postgres=# insert into remp values (1, 'foo');
> INSERT 0 1
> postgres=# insert into locp values (2, 'bar');
> INSERT 0 1
> postgres=# select tableoid::regclass, * from pt;
> tableoid | a | b
> ----------+---+-----
> remp | 1 | foo
> locp | 2 | bar
> (2 rows)
>
> postgres=# create function postgres_fdw_abs(int) returns int as $$begin
> return abs($1); end$$ language plpgsql immutable;
> CREATE FUNCTION
> postgres=# explain verbose update pt set a = postgres_fdw_abs(a) + 1 where b
> = 'foo';
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> -------------
> Update on public.pt (cost=100.00..154.54 rows=12 width=42)
> Foreign Update on public.remp
> Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1
> Update on public.locp
> -> Foreign Scan on public.remp (cost=100.00..127.15 rows=6 width=42)
> Output: (postgres_fdw_abs(remp.a) + 1), remp.b, remp.ctid
> Remote SQL: SELECT a, b, ctid FROM public.loct WHERE ((b =
> 'foo'::text)) FOR UPDATE
> -> Seq Scan on public.locp (cost=0.00..27.39 rows=6 width=42)
> Output: (postgres_fdw_abs(locp.a) + 1), locp.b, locp.ctid
> Filter: (locp.b = 'foo'::text)
> (10 rows)
>
> postgres=# update pt set a = postgres_fdw_abs(a) + 1 where b = 'foo';
> ERROR: new row for relation "loct" violates check constraint "loct_a_check"
> DETAIL: Failing row contains (2, foo).
> CONTEXT: Remote SQL command: UPDATE public.loct SET a = $2 WHERE ctid = $1
>
> To be able to move tuples across foreign servers, I think we would first
> have to do something to allow re-routing tuples from foreign partitions.
> The patches I proposed hasn't done anything about that, so the patched
> version would behave the same way as HEAD with/without direct modification.

Yes.

As I said, since update re-routing is triggered by partition
constraint failure for the new row and we don't check (any)
constraints for foreign tables, that means re-routing won't occur for
foreign partitions.

>> That said, I saw in the changes to ExecSetupPartitionTupleRouting() that
>> BeginForeignRouting() is called for a foreign partition even if direct
>> modification might already have been set up. If direct modification is
>> set up, then ExecForeignRouting() will never get called, because we'd
>> never call ExecUpdate() or ExecInsert().
>
> Yeah, but I am thinking to leave the support for re-routing tuples across
> foreign servers for another patch.
>
> One difference between HEAD and the patched version would be: we can
> re-route tuples from a plain partition to a foreign partition if the foreign
> partition supports this tuple-routing API. Here is an example using the
> above data:
>
> postgres=# select tableoid::regclass, * from pt;
> tableoid | a | b
> ----------+---+-----
> remp | 1 | foo
> locp | 2 | bar
> (2 rows)
>
> postgres=# update pt set a = 1 where b = 'bar';
> UPDATE 1
> postgres=# select tableoid::regclass, * from pt;
> tableoid | a | b
> ----------+---+-----
> remp | 1 | foo
> remp | 1 | bar
> (2 rows)
>
> This would introduce an asymmetry (we can move tuples from plain partitions
> to foreign partitions, but the reverse is not true), but I am thinking that
> it would be probably okay to document about that.

I see. Thanks for the explanation.

About just documenting the asymmetry you mentioned that's caused by
the fact that we don't enforce constraints on foreign tables, I
started wondering if we shouldn't change our stance on the matter wrt
"partition" constraints? But, admittedly, that's a topic for a
different thread.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-02-23 08:05:11 Re: RTLD_GLOBAL (& JIT inlining)
Previous Message John Naylor 2018-02-23 07:38:02 Re: WIP: a way forward on bootstrap data