Re: Planner features, discussion

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pasman pasman'ski <pasman(dot)p(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Planner features, discussion
Date: 2010-07-14 00:47:35
Message-ID: 4C3D0927.6030202@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 13/07/2010 10:52 PM, Greg Smith wrote:

> I heard a scholarly treatment of that topic from Jim Nasby recently,
> where he proposed a boolean GUC to toggle the expanded search behavior
> to be named plan_the_shit_out_of_it.

I was thinking that something like "duplicate subquery/function
elimitation" might be handy, though an extension to WITH would eliminate
the need for it (see below). Consider code like this:

SELECT (SELECT somequery) FROM ...
WHERE (SELECT SOMEQUERY) > somevalue
ORDER BY (SELECT somequery)

that invokes some non-trivial "somequery" several times. I often wanted
to simplify it, and it wasn't always practical to convert it to add
(SELECT somequery) to the join list.

I expected that with 8.4 I'd be able to write something more along the
lines of:

WITH result = (SELECT somequery)
SELECT result FROM ...
WHERE result > somevalue
ORDER BY result;

which makes such an optimization less than necessary. Why complicate the
planner when you can fix your SQL?

However, in the case above the subquery needs to be referenced from a
scalar context not as a join, and WITH expressions don't seem to be
useful for scalar results. The names defined by WITH are only visible as
FROM targets. So this doesn't work:

=> WITH aconstant(constval) AS (VALUES(1)) SELECT x.*, constval FROM
generate_series(1,10) AS x;
ERROR: column "constval" does not exist
LINE 1: ...TH aconstant(constval) AS (VALUES(1)) SELECT x.*, constval F...

nor:

=> WITH aconstant AS (VALUES(1)) SELECT x.*, aconstant FROM
generate_series(1,10) AS x;
ERROR: column "aconstant" does not exist
LINE 1: WITH aconstant AS (VALUES(1)) SELECT x.*, aconstant FROM gen...

... so you're forced to fall back on adding it as an additional join
expression - which isn't always reasonable or possible.

Extending WITH to be useful for defining constants and single-evaluation
variables like the above would be really, really nice, and would avoid
some ugly SQL mangling and any need for compliated planner features that
try to match up and combine subquery trees.

Such a WITH extension wouldn't be of any help where the subqueries
referenced from-list columns, though, so perhaps intelligent combination
of duplicate subqueries would be handy anyway.

I certainly think that combining multiple identical invocations of
stable and immutable functions within a query - ie pre-evaluating the
call and substitutiong the results - would be a desirable feature. It'd
potentially something that could be part of a more aggressively
optimizing planner. It'd be a *LOT* simpler than trying to do the same
thing with subqueries (though deciding when the function arguments are
"the same" might not always be simple) and would be really handy.

Some uses could be solved by extending WITH as above, but not where one
or more function parameters depends on expressions involving fields in
the from-list. Consider:

SELECT expensive_function(sometable.x)
FROM sometable
WHERE expensive_function(sometable.x) > 4;

which can't even be written nicely in a form that only evaluates
expensive_function once. In this case it can be mangled into:

SELECT exf FROM (
SELECT sometable.*, expensive_function(sometable.x) AS exf
FROM sometable
) WHERE exf > 4;

... but in more complicated cases you can't always do that without
landing up evaluating many, many more invocations of
"expensive_function" than you wanted to in the subquery due to
limitations that structure imposes on your filtering.

Am I just missing something obvious? Or might a way to combine multiple
invocations of STABLE / IMMUTABLE functions be a useful thing for an
aggressively optimizing planner to do?

--
Craig Ringer

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Craig Ringer 2010-07-14 01:08:14 Re: Postgresql 8.4, XPath and name() function
Previous Message Julian Mehnle 2010-07-13 23:08:28 Re: Efficient Way to Merge Two Large Tables