Re: Yet another abort-early plan disaster on 9.3

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Yet another abort-early plan disaster on 9.3
Date: 2014-09-29 21:54:31
Message-ID: 5429D517.3090900@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 09/26/2014 01:06 AM, Simon Riggs wrote:
> On 23 September 2014 00:56, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> We've hashed that out a bit, but frankly I think it's much more
>> profitable to pursue fixing the actual problem than providing a
>> workaround like "risk", such as:
>>
>> a) fixing n_distinct estimation
>> b) estimating stacked quals using better math (i.e. not assuming total
>> randomness)
>> c) developing some kind of correlation stats
>>
>> Otherwise we would be just providing users with another knob there's no
>> rational way to set.
>
> I believe this is a serious issue for PostgreSQL users and one that
> needs to be addressed.
>
> n_distinct can be fixed manually, so that is less of an issue.

It's an issue for the 99.8% of our users who don't know what n_distinct
is, let alone how to calculate it. Also, changing it requires an
exclusive lock on the table. Of course, you and I have been over this
issue before.

One thing I'm wondering is why our estimator is creates n_distinct as a
% so seldom. Really, any time n_distinct is over 10K we should be
estimating a % instead. Now, estimating that % has its own issues, but
it does seem like a peculiar quirk of our stats model.

Anyway, in the particular case I posted fixing n_distinct to realistic
numbers (%) fixed the query plan.

>
> The problem, as I see it, is different. We assume that if there are
> 100 distinct values and you use LIMIT 1 that you would only need to
> scan 1% of rows. We assume that the data is arranged in the table in a
> very homogenous layout. When data is not, and it seldom is, we get
> problems.
>
> Simply put, assuming that LIMIT will reduce the size of all scans is
> just way wrong. I've seen many plans where increasing the LIMIT
> dramatically improves the plan.
>
> If we can at least agree it is a problem, we can try to move forwards.

That is certainly another problem. Does correlation stat figure in the
LIMIT calculation at all, currently? That's what correlation stat is
for, no?

Also, to be fair, physical correlation of rows can also lead to
abort-early plans being extra fast, if everything we want is towards the
beginning of the index. Which means we'd need working multi-column
correlation, which is a known hard problem.

For example, consider the query:

SELECT id, updated_on FROM audit_log
WHERE updated_on < '2010-01-01'
ORDER BY id LIMIT 10;

In an append-only table, that query is liable to be very fast with an
abort-early plan scanning on an ID index (AEP from now on), since the
oldest rows are likely to correspond with the smallest IDs. But then
the user does this:

SELECT id, updated_on FROM audit_log
WHERE updated_on < '2010-01-01'
ORDER BY id DESC LIMIT 10;

... and a completely different plan is called for, because using an AEP
will result in reverse scanning most of the index. However, I'm not
sure the planner knows the difference, since it's only comparing the
estimated selectivity of (updated_on < '2010-01-01') and seeing that
it's 20% of rows. I bet you'd get an AEP in the 2nd case too.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2014-09-29 21:57:45 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Previous Message Andres Freund 2014-09-29 21:49:40 Re: test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2014-09-29 23:00:37 Re: Yet another abort-early plan disaster on 9.3
Previous Message Simon Riggs 2014-09-29 20:53:10 Re: Yet another abort-early plan disaster on 9.3