Re: Removing INNER JOINs

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Mart Kelder <mart(at)kelder31(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Removing INNER JOINs
Date: 2015-01-14 19:36:21
Message-ID: 54B6C535.4030608@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/13/15 5:02 AM, David Rowley wrote:
>
> I can't quite get my head around what you mean here, as the idea sounds quite similar to something that's been discussed already and ruled out.
> If we're joining relation a to relation b, say the plan chosen is a merge join. If we put some special node as the parent of the merge join then how will we know to skip or not skip any sorts that are there especially for the merge join, or perhaps the planner choose an index scan as a sorted path, where now that the join is removed, could become a faster seqscan. The whole plan switching node discussion came from this exact problem. Nobody seemed to like the non-optimal plan that was not properly optimised for having the relation removed.

In my mind this is the same as a root level Alternative Plan, so you'd be free to do whatever you wanted in the alternate:

-> blah blah
-> Alternate
-> Merge join
...
-> SeqScan

I'm guessing this would be easier to code, but that's just a guess. The other advantage is if you can't eliminate the join to table A at runtime you could still eliminate table B, whereas a top-level Alternate node doesn't have that flexibility.

This does have a disadvantage of creating more plan variations to consider. With a single top-level Alternate node there's only one other option. I believe multiple Alternates would create more options to consider.

Ultimately, unless this is easier to code than a top-level alternate, it's probably not worth pursuing.

> It also seems that transitioning through needless nodes comes at a cost. This is why I quite liked the Alternative Plan node idea, as it allowed me to skip over the alternative plan node at plan initialisation.

For init I would expect this to result in a smaller number of nodes than a top-level Alternate, because you wouldn't be duplicating all the stuff above the joins. That said, I rather doubt it's worth worrying about the cost to init; won't it be completely swamped by extra planning cost, no matter how we go about this?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-01-14 19:45:38 Re: pg_basebackup fails with long tablespace paths
Previous Message Stephen Frost 2015-01-14 19:23:07 Re: INSERT ... ON CONFLICT UPDATE and RLS