Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Date: 2016-02-04 16:54:52
Message-ID: CAFjFpRcp2Emv8u+jun2_Q=r5wdCigqHNNR8wehnBEK3gD6FU1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 4, 2016 at 2:42 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> > The patch implements your algorithm to deparse a query as described in
> > previous mail. The logic is largely coded in deparseFromExprForRel() and
> > foreign_join_ok(). The later one pulls up the clauses from joining
> relations
> > and first one deparses the FROM clause recursively.
>
> Cool!
>
> + /* Add outer relation. */
> + appendStringInfo(buf, "(%s", join_sql_o.data);
> +
> + /* Add join type */
> + appendStringInfo(buf, " %s JOIN ",
> get_jointype_name(fpinfo->jointype));
> +
> + /* Add inner relation */
> + appendStringInfo(buf, "%s", join_sql_i.data);
> +
> + /* Append ON clause; ON (TRUE) in case empty join clause
> list */
> + appendStringInfoString(buf, " ON ");
>
> Uh, couldn't that all be done as a single appendStringInfo?
>
>
Done.

> It seems a little tortured the way you're passing "relations" all the
> way down the callstack from deparseSelectStmtForRel, and at each level
> it might be NULL. If you made a rule that the caller MUST pass a
> StringInfo, then you could get rid of some conditional logic in
> deparseFromExprForRel. By the way, deparseSelectSql()'s header
> comment could use an update to mention this additional argument.
> Generally, it's helpful to say in each relevant function header
> comment something like "May be NULL" or "Must not be NULL" in cases
> like this to clarify the API contract.
>

Done.

How about building this string when we construct fpinfo? We will waste some
cycles for base relations but we will have lesser arguments in deparsing
routines. I have attached patch recursive_relations.patch implementing this
idea. The patch can be applied on top of the attached patches.

>
> Similarly, I would be inclined to continue to require that
> deparseTargetList() have retrieved_attrs != NULL. If the caller
> doesn't want the value, it can pass a dummy variable and ignore the
> return value. This is of course slightly more cycles, but I think
> it's unlikely to matter, and making the code simpler would be good.
>

Done.

>
> + * Function is the entry point to deparse routines while constructing
> + * ForeignScan plan or estimating cost and size for ForeignPath. It is
> called
> + * recursively to build SELECT statements for joining relations of a
> pushed down
> + * foreign join.
>
> "This function is the entrypoint to the routines, either when
> constructing ForeignScan plan or when estimating" etc.
>

I have removed this comment altogether. The second sentence in the comment
no more holds true as we are not calling this function recursively any
more. The first statement too doesn't add much value, The thing that is
says, was true even before join pushdown and at that time that sentence
wasn't there. The opening comment says what that function does.

>
> + * tuple descriptor for the corresponding foreign scan. For a base
> relation,
> + * which is not part of a pushed down join, fpinfo->attrs_used can be
> used to
> + * construct SELECT clause, thus the function doesn't need tlist. Hence
> when
> + * tlist passed, the function assumes that it's constructing the SELECT
> + * statement to be part of a pushed down foreign join.
>
> I thought you got rid of that assumption. I think it should be gotten
> rid of, and then the comment can go too. If we're keeping the comment
> for some reason, should be "doesn't need the tlist" and when "when the
> tlist is passed".
>

Done. tlist will be used only for join relations. For base relations
fpinfo->attrs_used will be used.

>
> + * 1, since those are the attribute numbers are in the corresponding scan.
>
> Extra "are". Should be: "Those are the attribute numbers in the
> corresponding scan."
>

I don't have that comment in the patch anymore. Probably got removed as
part of the other refactoring.

>
> Would it be nuts to set fdw_scan_tlist in all cases? Would the code
> come out simpler than what we have now?
>

deparesTargetList has an optimization when whole-row reference appears. It
doesn't include whole-row reference and instead includes all the
attributes, so that whole-row reference can be constructed at the time of
projection. We will have to mimic similar logic while creating
fdw_scan_tlist for base relations. Otherwise, we will fetch larger row from
the foreign table. I am still working on this part. Mostly will post it
with the next patch.

In set_foreignscan_references(), we have
1109 if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
If we are to set fdw_scan_tlist for base relation, it would stamp the Vars
with INDEX_VAR which would be undesirable. May be we should just change
that condition to if (fscan->scan.scanrelid == 0). What do you think?

Attaching patches
pg_fdw_core_v8.patch: core changes
pg_fdw_join_v8.patch: postgres_fdw changes for join pushdown
pg_join_pd_v8.patch: combined patch for ease of testing.
recursive_relations.patch: for building relation description while
constructing fpinfo.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_fdw_core_v8.patch text/plain 8.1 KB
pg_fdw_join_v8.patch text/plain 136.1 KB
pg_join_pd_v8.patch text/plain 173.9 KB
recursive_relations.patch text/plain 19.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2016-02-04 16:55:21 Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Previous Message Robert Haas 2016-02-04 16:27:38 Re: Support for N synchronous standby servers - take 2