Proposal: relaxing link between explicit JOINs and execution order

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org, pgsql-performance(at)postgreSQL(dot)org
Subject: Proposal: relaxing link between explicit JOINs and execution order
Date: 2003-01-22 23:01:53
Message-ID: 25071.1043276513@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

There's been some recent discussion about the fact that Postgres treats
explicit JOIN syntax as constraining the actual join plan, cf
http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/explicit-joins.html

This behavior was originally in there simply because of lack of time to
consider alternatives. I now realize that it wouldn't be hard to get
the planner to do better --- basically, preprocess_jointree just has to
be willing to fold JoinExpr-under-JoinExpr into a FromExpr when the
joins are inner joins.

But in the meantime, some folks have found the present behavior to be
a feature rather than a bug, since it lets them control planning time
on many-table queries. If we are going to change it, I think we need
some way to accommodate both camps.

What I've been toying with is inventing a GUC variable or two. I am
thinking of defining a variable that sets the maximum size of a FromExpr
that preprocess_jointree is allowed to create by folding JoinExprs.
If this were set to 2, the behavior would be the same as before: no
collapsing of JoinExprs can occur. If it were set to a large number,
inner JOIN syntax would never affect the planner at all. In practice
it'd be smart to leave it at some value less than GEQO_THRESHOLD, so
that folding a large number of JOINs wouldn't leave you with a query
that takes a long time to plan or produces unpredictable plans.

There is already a need for a GUC variable to control the existing
behavior of preprocess_jointree: right now, it arbitrarily uses
GEQO_THRESHOLD/2 as the limit for the size of a FromExpr that can be
made by collapsing FromExprs together. This ought to be a separately
settable parameter, I think.

Comments? In particular, can anyone think of pithy names for these
variables? The best I'd been able to come up with is MAX_JOIN_COLLAPSE
and MAX_FROM_COLLAPSE, but neither of these exactly sing...

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-01-22 23:14:30 Re: Terrible performance on wide selects
Previous Message Didier Moens 2003-01-22 22:31:45 Re: Foreign key wierdness

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-01-22 23:14:30 Re: Terrible performance on wide selects
Previous Message Rod Taylor 2003-01-22 22:54:20 Re: Slow query on OS X box