Re: partition routing layering in nodeModifyTable.c

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, amitdkhan(dot)pg(at)gmail(dot)com
Subject: Re: partition routing layering in nodeModifyTable.c
Date: 2019-08-01 09:38:09
Message-ID: CAPmGK15=oFHmWNND5vopfokSGfn6jMXVvnHa7K7P49F7k1hWPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit-san,

On Thu, Aug 1, 2019 at 10:33 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Wed, Jul 31, 2019 at 9:04 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> > On Wed, Jul 31, 2019 at 5:05 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > On Tue, Jul 30, 2019 at 4:20 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > > On Sat, Jul 20, 2019 at 1:52 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > > IOW, we should just change the direct modify calls to get the relevant
> > > > > ResultRelationInfo or something in that vein (perhaps just the relevant
> > > > > RT index?).
> > > >
> > > > It seems easy to make one of the two functions that constitute the
> > > > direct modify API, IterateDirectModify(), access the result relation
> > > > from ForeignScanState by saving either the result relation RT index or
> > > > ResultRelInfo pointer itself into the ForeignScanState's FDW-private
> > > > area. For example, for postgres_fdw, one would simply add a new
> > > > member to PgFdwDirectModifyState struct.
> > > >
> > > > Doing that for the other function BeginDirectModify() seems a bit more
> > > > involved. We could add a new field to ForeignScan, say
> > > > resultRelation, that's set by either PlanDirectModify() (the FDW code)
> > > > or make_modifytable() (the core code) if the ForeignScan node contains
> > > > the command for direct modification. BeginDirectModify() can then use
> > > > that value instead of relying on es_result_relation_info being set.
> > > >
> > > > Thoughts? Fujita-san, do you have any opinion on whether that would
> > > > be a good idea?
> >
> > I'm still not sure that it's a good idea to remove
> > es_result_relation_info, but if I had to say then I think the latter
> > would probably be better.
>
> Could you please clarify what you meant by the "latter"?
>
> If it's the approach of adding a resultRelation Index field to
> ForeignScan node, I tried and had to give up, realizing that we don't
> maintain ResultRelInfos in an array that is indexable by RT indexes.
> It would've worked if es_result_relations had mirrored es_range_table,
> although that probably complicates how the individual ModifyTable
> nodes attach to that array. In any case, given this discussion,
> further hacking on a global variable like es_result_relations may be a
> course we might not want to pursue.

Yeah, I mean that approach. To get the ResultRelInfo, I think
searching through the es_result_relations for the ResultRelInfo for
the resultRelation added to the ForeignScan in BeginDirectModify()
like the attached, which is created along your proposal.
ExecFindResultRelInfo() added by the patch wouldn't work efficiently
for inherited UPDATE/DELETE where there are many children that are
foreign tables, but I think that would probably be OK because in most
use-cases, including sharding, the number of such children would be at
most < 100 or so. For improving the efficiency for the cases where
there are a lot more such children, however, I think it would be an
option to do something about global variables so that we can access
the ResultRelInfos by RT indexes more efficiently, because IMO I don't
think that would be against the point here ie, removing the dependency
on es_result_relation_info. Maybe I'm missing something, though.

> > I'm planning to rework on direct
> > modification to base it on upper planner pathification so we can
> > perform direct modification without the ModifyTable node. (I'm not
> > sure we can really do this for inherited UPDATE/DELETE, though.) For
> > that rewrite, I'm thinking to call BeginDirectModify() from the
> > ForeignScan node (ie, ExecInitForeignScan()) as-is. The latter
> > approach would allow that without any changes and avoid changing that
> > API many times. That's the reason why I think the latter would
> > probably be better.
>
> Will the new planning approach you're thinking of get rid of needing
> any result relations at all (and so the ResultRelInfos in the
> executor)?

I think the new planning approach would still need result relations
and ResultRelInfos in the executor as-is; and the FDW would probably
use the ResultRelInfo for the foreign table created by the core. Some
of the ResultRelInfo data would prpbably need to be initialized by the
FDW itesef, though (eg, WCO constraints and/or RETURNING if any).

Best regards,
Etsuro Fujita

Attachment Content-Type Size
postgres-fdw-dont-depend-on-es_result_relation_info.patch application/octet-stream 7.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-08-01 09:39:53 Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead
Previous Message Fabien COELHO 2019-08-01 09:34:34 Re: refactoring - share str2*int64 functions