Re: BUG #5611: SQL Function STABLE promoting to VOLATILE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Brian Ceccarelli <bceccarelli(at)net32(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #5611: SQL Function STABLE promoting to VOLATILE
Date: 2010-08-12 16:57:15
Message-ID: AANLkTinCUYn-_i17a_GrGWyyc8x-j9QE6LqmYnfE22RW@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Aug 12, 2010 at 11:15 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm not exactly following this.  My guess is that the breakeven point
>> is going to be pretty low because I think Param nodes are pretty
>> cheap.
>
> If you have any significant number of executions of the expression, then
> of course converting it to an initplan is a win.  What I'm worried about
> is where you have just a small number (like maybe only one, if it gets
> converted to an indexqual for instance).  Then the added expense of
> setting up the initplan isn't going to be repaid.  As long as the
> expression is pretty expensive, the percentage overhead of wrapping it
> in an initplan will probably be tolerable anyway, but I'm not sure where
> the threshold of "pretty expensive" is for that.

Oh, I see. It seems a lot more elegant if we can start by determining
whether the expression is in a context where it's likely to be
executed more than once.

*thinks*

I wonder if we could make this decision mostly based on node types.
Maybe it's reasonable to say that if we're doing say, a Seq Scan, we
always assume it's going to get evaluated more than once. Yeah, the
table could have <2 rows in it, but mostly it won't, and I don't know
that it's wise to optimize for that case even if we *think* that's
what the statistics are telling us. Similarly for a Function Scan,
CTE Scan, Worktable Scan, etc. We *could* look at the row estimates,
but I bet it isn't necessary. On the other hand, for an index qual,
it's probably pointless. I guess the hard case is a filter qual on an
index scan... it's not too clear to me what the right thing to do is
in that case.

>> Well, that's certainly a good place to start, but I was thinking that
>> it would be nice to optimize things like this:
>
>> SELECT * FROM foo WHERE somecolumn = somefunc();
>
>> This is OK if we choose a straight index scan, but it's probably very
>> much worth optimizing if we end up doing anything else.  If that's too
>> hairy, then maybe not, but it's not obvious to me why it would be
>> expensive.
>
> Because you have to look at every subexpression of every subexpression
> to figure out if it's (a) stable and (b) expensive.  Each of those
> checks is un-cheap in itself, and if you blindly apply them at every
> node of an expression tree the cost will be exponential.

I think you'd need some kind of expression tree walker that builds up
a list of maximal stable subexpression trees. It would be nice to
figure this out at some point in the process where we already have to
check volatility anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2010-08-12 18:26:51 Re: BUG #5614: Varchar column (with DEFAULT NULL) stores 'UL' value instead of null
Previous Message Tom Lane 2010-08-12 15:15:36 Re: BUG #5611: SQL Function STABLE promoting to VOLATILE