Re: WIP: Upper planner pathification

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Upper planner pathification
Date: 2016-03-07 15:37:18
Message-ID: CA+TgmoZrw007BTnsRgg1CLf5=SQ+qLcFiVLup7x0Q-WVn2cy2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 6, 2016 at 10:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>> On Sat, Mar 5, 2016 at 10:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Is there some reason why hash and nestloop are safe but merge isn't?
>
>> I think it is because we consider to pushdown hash and nestloop to workers,
>> but not merge join and the reason for not pushing mergejoin is that
>> currently we don't have executor support for same, more work is needed
>> there.
>
> If that's true, then mergejoin paths ought to be marked parallel-unsafe
> explicitly (with a comment as to why), not just silently reduced to degree
> zero in a manner that looks more like an oversight than anything
> intentional.
>
> I also note that the regression tests pass with this patch and parallel
> mode forced, which seems unlikely if allowing a parallel worker to execute
> a join works for only two out of the three join types. And checking the
> git history for nodeHashjoin.c, nodeHash.c, and nodeNestloop.c shows no
> evidence that any of those files have been touched for parallel query,
> so it's pretty hard to see a reason why those would work in parallel
> queries but nodeMergejoin.c not.
>
> I still say the code as it stands is merely a copy-and-pasteo.

I might call it a thinko rather than a copy-and-pasteo, but basically,
you are right and Amit is wrong. I feel confident making that
statement because I wrote the code, so I think I'm well-positioned to
judge whether I did a particular thing on purpose or not.

The currently-committed code generates paths where nested loops and
hash joins get pushed beneath the Gather node, but does not generate
paths where merge joins have been pushed beneath the Gather node. And
the reason I didn't try to generate those paths is because I believe
they will almost always suck. As of now, what we know how to do is
build a partial path for a join by joining a partial path for the
outer input rel against an ordinary path for the inner rel. That
means that the work of generating the inner rel has to be redone in
each worker. That's not a problem if we've got something like a
nested loop with a parameterized inner index scan, because that sort
of plan redoes all the work for every row anyway. It is a problem for
a hash join, but it's not too hard for it to be worthwhile anyway if
the build table is small. For a merge join, though, it seems rather
unpromising. It's really doubtful that we want each worker to
independently sort the inner rel and then have them join their own
subset of the outer rel against their own copy of the sort. *Maybe*
it could win if the inner path is an index scan, but I wasn't really
sure that would come up and be a win often enough to be worth the cost
of generating the path. We tend to only use merge joins when both of
the relations involved are large, and index-scanning a large relation
tends to lose to sorting it. So it just seemed like a dead end.

Now, if somebody comes along with a patch to create partial merge join
paths and shows that it improves performance on some class of queries
I haven't thought about, I am not going to complain. And in the long
run, I would like to have a facility to partition both relations on
the fly and then have the workers do a merge join per partition.
That's one of the two standard algorithms in the literature for
parallel join - hash join is the other. I'm quite certain that's an
important piece of technology to develop, but it's a huge project unto
itself. My priority for 9.6 is to have a user-visible feature that
takes the infrastructure that we already have as far as it reasonably
can go. Building new infrastructure will have to wait for a future
release.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2016-03-07 16:02:45 Re: Badly designed error reporting code in controldata_utils.c
Previous Message Robert Haas 2016-03-07 15:19:35 Re: Move PinBuffer and UnpinBuffer to atomics