Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Date: 2016-02-02 16:18:12
Message-ID: CAFjFpRcPaf128gNjPpngn7hFKUurWe57n1q8oiLrqagQrCzZKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 2, 2016 at 5:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Feb 1, 2016 at 8:27 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> > Here are patches rebased on recent commit
> > cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original
> deparseSelectSql
> > as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to
> > construct SELECT and FROM clauses for base and join relations.
> >
> > pg_fdw_core_v5.patch GetUserMappingById changes
> > pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including
> > suggestions as described below
> > pg_join_pd_v5.patch: combined patch for ease of testing.
> >
> > The patches also have following changes along with the changes described
> in
> > my last mail.
> > 1. Revised the way targetlists are handled. For a bare base relation the
> > SELECT clause is deparsed from fpinfo->attrs_used but for a base relation
> > which is part of join relation, the expected targetlist is passed down to
> > deparseSelectSqlForBaseRel(). This change removed 75 odd lines in
> > build_tlist_to_deparse() which were very similar to
> > deparseTargetListFromAttrsUsed() in the previous patch.
>
> Nice!
>
> > 2. Refactored postgresGetForeignJoinPaths to be more readable moving the
> > code to assess safety of join pushdown into a separate function.
>
> That looks good. But maybe call the function foreign_join_ok() or
> something like that? is_foreign_join() isn't terrible but it sounds a
> little odd to me.
>

I used name is_foreign_join(), which assesses push-down safety of a join,
to have similar naming convention with is_foreign_expr(), which checks
push-down safety of an expression. But foreign_join_ok() is fine too. Used
that.

>
> The path-copying stuff in get_path_for_epq_recheck() looks a lot
> better now, but you neglected to add a comment explaining why you did
> it that way (e.g. "Make a shallow copy of the join path, because the
> planner might free the original structure after a future add_path().
> We don't need to copy the substructure, though; that won't get freed."
>

I alluded to that in the second sentence of comment
3259 * Since we will need to replace any foreign paths for join with their
alternate
3260 * paths, we need make a copy of the local path chosen. Also, that
helps in case
3261 * the planner chooses to throw away the local path.

But that wasn't as clear as your wording. Rewrote the paragraph using your
wording.
3259 * Since we will need to replace any foreign paths for join with their
alternate
3260 * paths, we need make a copy of the local path chosen. Make a shallow
copy of
3261 * the join path, because the planner might free the original
structure after a
3262 * future add_path(). We don't need to copy the substructure, though;
that won't
3263 * get freed.

> I would forget about setting merge_path->materialize_inner = false;
> that doesn't seem essential.

Done.

> Also, I would arrange things so that if
> you hit an unrecognized path type (like a custom join, or a gather)
> you skip that particular path instead of erroring out.

Ok. Done.

> I think this
> whole function should be moved to core,

I have moved the function to foreign.c where most of the FDW APIs are
located and declared it in fdwapi.h. Since the function deals with the
paths, I thought of adding it to some path related file, but since it's a
helper function that an FDW can use, I thought foreign.c would be better. I
have also added documentation in fdwhandler.sgml. I have renamed the
function as GetPathForEPQRecheck() in order to be consistent with other FDW
APIs. In the description I have just mentioned copy of a local path. I am
not sure whether we should say "shallow copy".

> and I think the argument
> should be a RelOptInfo * rather than a List *.
>

Done.

>
> + * We can't know VERBOSE option is specified or not, so always add
> shcema
>
> We can't know "whether" VERBOSE...
> shcema -> schema
>
>
Done.

> + * the join relaiton is already considered, so that we won't waste
> time in
>
> Typo.
>
>
Done.

> + * judging safety of join pushdow and adding the same paths again if
> found
>
> Typo.
>
> Done.

Sorry for those typos.

Attaching patches with reply to your next mail.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2016-02-02 16:21:38 Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Previous Message Masahiko Sawada 2016-02-02 16:16:54 Re: Freeze avoidance of very large table.