Re: Idea about estimating selectivity for single-column expressions

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Idea about estimating selectivity for single-column expressions
Date: 2009-08-19 14:28:44
Message-ID: 407d949e0908190728j4162d267j1cb5e1c0a1f693e6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 19, 2009 at 3:16 PM, Greg Stark<gsstark(at)mit(dot)edu> wrote:
> On Wed, Aug 19, 2009 at 3:53 AM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> * The expression might throw an error for some inputs, for instance
>>        (1 / field) < 0.5
>> which would fail on zero.  We could recover by wrapping the whole
>> estimation process in a subtransaction, but that seems really expensive.
>> I thought about arguing that the failure would happen anyway at runtime,
>> but that doesn't hold water --- for example, the user might have just
>> deleted all the rows with field = 0, and would have grounds to complain
>> if the query failed; but there would still be an entry for zero in the
>> histogram.
>
> We could add another flag in pg_proc for functions which cannot throw
> an error. Perhaps all index operator class operators be required to
> use such functions too?

Another thought. In the above case it would actually be fine to catch
the error with PG_TRY() without a subtransaction. As long as no shared
database state has been modified when the error is thrown then the
subtransaction isn't needed to roll them back.

In theory "immutable" should mean that's true, though there's the
traditional question of whether invisible internal state changes count
as mutations and it would be too fragile to depend on.

What I'm wondering is if we can reliably detect any state changes that
would need to be rolled back. If the PG_CATCH() fires and no state
changes have occurred since the PG_TRY() then we can just continue and
ignore that entry or treat it as false. If any have then we have to
throw an error but if it requires very unusual types of functions then
perhaps that's ok.

I'm not sure if it's sufficient but does the lazy xid assignment
provide a place to hook this in? We could add a counter to indicate
how many times the xid was needed -- if that counter doesn't change
while between the PG_TRY()/PG_CATCH() then we know there's nothing
that would have needed a proper subtransaction to roll back.

As I said I'm not sure that's actually sufficient. Are there other
shared state that subtransactions are needed to roll back? Are there a
small enumerable set or are we going to be constantly discovering new
kinds of state that need to be protected against?

--
greg
http://mit.edu/~gsstark/resume.pdf

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-08-19 14:30:45 Re: Idea about estimating selectivity for single-column expressions
Previous Message Greg Stark 2009-08-19 14:16:45 Re: Idea about estimating selectivity for single-column expressions