Re: Yet another abort-early plan disaster on 9.3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Yet another abort-early plan disaster on 9.3
Date: 2014-09-29 23:00:37
Message-ID: 13982.1412031637@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-performance

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> The way I'm seeing it, you can't assume the LIMIT will apply to any
> IndexScan that doesn't have an index condition. If it has just a
> filter, or nothing at all, just an ordering then it could easily scan
> the whole index if the stats are wrong.

That statement applies with equal force to *any* plan with a LIMIT;
it's not just index scans.

The real question is to what extent the tuples satisfying the extra
filter condition are randomly distributed with respect to the index order
(or physical order, if it's a seqscan). The existing cost estimation
code effectively assumes that they're perfectly uniformly distributed,
which is a good average-case assumption but can be horribly wrong in
the worst case.
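
To make the uniformity assumption concrete, here is a minimal Python sketch, not the planner's actual code. It assumes the LIMIT cost is obtained by linear interpolation between startup and total cost, and compares evenly spread matches with matches clustered at the end of the scan. The function names and the example numbers are hypothetical, chosen only for illustration.

```python
def limit_cost_uniform(startup_cost, total_cost, est_rows, limit_rows):
    """Cost of fetching limit_rows rows, assuming the est_rows matching
    rows are spread evenly through the scan (linear interpolation)."""
    if est_rows <= 0:
        return total_cost
    fraction = min(1.0, limit_rows / est_rows)
    return startup_cost + (total_cost - startup_cost) * fraction


def tuples_read(match_positions, limit_rows, table_size):
    """Tuples actually read before limit_rows matches have been seen.
    match_positions holds sorted 1-based positions of matching tuples
    in scan order; if there are too few matches, the whole table is read."""
    if limit_rows > len(match_positions):
        return table_size
    return match_positions[limit_rows - 1]


# Hypothetical table: 1,000,000 tuples, 1,000 of them match, LIMIT 10.
TABLE_SIZE = 1_000_000
evenly_spread = list(range(1000, TABLE_SIZE + 1, 1000))
clustered_at_end = list(range(TABLE_SIZE - 999, TABLE_SIZE + 1))

uniform_read = tuples_read(evenly_spread, 10, TABLE_SIZE)
clustered_read = tuples_read(clustered_at_end, 10, TABLE_SIZE)
```

With evenly spread matches the scan stops after about 1% of the table, which is what the interpolated cost predicts. With the same number of matches clustered at the end, nearly the whole table is read, although the selectivity estimate is identical in both cases.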

If we could settle on some other model for the probable distribution
of the matching tuples, we could adjust the cost estimates for LIMIT
accordingly. I don't have enough statistics background to know what a
realistic alternative would be.

Another possibility is to still assume a uniform distribution but estimate
for, say, a 90% probability instead of 50% probability that we'll find
enough tuples after scanning X amount of the table. Again, I'm not too
sure what that translates to in terms of the actual math, but it sounds
like something a statistics person could do in their sleep.
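
One way the math could look, as a sketch under a stated assumption: each scanned tuple matches independently with probability p (a Bernoulli model, which is an assumption of this example and not something the planner is known to use). The number of matches among n scanned tuples is then Binomial(n, p), and the quantity of interest is the smallest n for which at least k matches are found with the desired probability. The function names are hypothetical.

```python
import math


def prob_at_least_k(n, p, k):
    """P(Binomial(n, p) >= k): chance that n scanned tuples contain at
    least k matches. Requires 0 < p < 1."""
    if k <= 0:
        return 1.0
    if n < k:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    below = 0.0
    for i in range(k):
        # log of C(n, i) * p^i * (1-p)^(n-i), computed in log space
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        below += math.exp(log_term)
    return max(0.0, 1.0 - below)


def tuples_needed(p, k, confidence):
    """Smallest n such that n scanned tuples contain at least k matches
    with probability >= confidence."""
    lo, hi = k, k
    while prob_at_least_k(hi, p, k) < confidence:
        hi *= 2  # grow the upper bound until it is sufficient
    while lo < hi:  # binary search; the probability is monotone in n
        mid = (lo + hi) // 2
        if prob_at_least_k(mid, p, k) >= confidence:
            hi = mid
        else:
            lo = mid + 1
    return lo


# Hypothetical example: selectivity 0.001, LIMIT 10.
n_median = tuples_needed(0.001, 10, 0.5)
n_90 = tuples_needed(0.001, 10, 0.9)
```

In this example the mean-based estimate is k / p = 10,000 tuples. The 90% figure comes out noticeably larger than that, so costing for 90% would make fast-start plans look somewhat more expensive without going all the way to the worst case.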

I do not think we should estimate for the worst case though. If we do,
we'll hear cries of anguish from a lot of people, including many of the
same ones complaining now, because the planner stopped picking fast-start
plans even for cases where they are orders of magnitude faster than the
alternatives.

regards, tom lane
