Re: Yet another abort-early plan disaster on 9.3

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Yet another abort-early plan disaster on 9.3
Date: 2014-09-30 17:28:02
Message-ID: CAMkU=1wCMLWZexqVUPXWFeKmPPUDMv6GEEFUVDQnb7gfW9Wamw@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On Mon, Sep 29, 2014 at 2:54 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> On 09/26/2014 01:06 AM, Simon Riggs wrote:
> > On 23 September 2014 00:56, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> >
> >> We've hashed that out a bit, but frankly I think it's much more
> >> profitable to pursue fixing the actual problem than providing a
> >> workaround like "risk", such as:
> >>
> >> a) fixing n_distinct estimation
> >> b) estimating stacked quals using better math (i.e. not assuming total
> >> randomness)
> >> c) developing some kind of correlation stats
> >>
> >> Otherwise we would be just providing users with another knob there's no
> >> rational way to set.
> >
> > I believe this is a serious issue for PostgreSQL users and one that
> > needs to be addressed.
> >
> > n_distinct can be fixed manually, so that is less of an issue.
>
> It's an issue for the 99.8% of our users who don't know what n_distinct
> is, let alone how to calculate it. Also, changing it requires an
> exclusive lock on the table. Of course, you and I have been over this
> issue before.
>

If 99.6% of our users don't have a problem with n_distinct in their system,
that would mean that only 50% of the people with the problem don't know how
to solve it. And those people can usually get excellent free help on the
internet.
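For reference, the manual fix being discussed is the per-column n_distinct override available since 9.0. A minimal sketch, using hypothetical table and column names:

```sql
-- Pin the estimate to an absolute count (hypothetical names):
ALTER TABLE events ALTER COLUMN user_id SET (n_distinct = 500000);
-- The override only takes effect after the next ANALYZE:
ANALYZE events;
```

As Josh notes above, the ALTER TABLE does take a lock on the table.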

But if the problem is not with n_distinct, but rather with most_common_freqs
(which I encounter more often than problems with n_distinct), all I can do
is shrug and say "yeah, I know about that problem. Either crank the
statistics target up as high as it will go, or it sucks to be you."
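The "crank it up" workaround, spelled out: the statistics target can be set per column and is hard-capped at 10000; raising it lengthens the most_common_vals/most_common_freqs arrays that ANALYZE collects. Hypothetical names again:

```sql
-- Collect larger MCV lists and histograms for one column:
ALTER TABLE events ALTER COLUMN user_id SET STATISTICS 10000;
ANALYZE events;  -- resample with the bigger target
```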

>
> One thing I'm wondering is why our estimator creates n_distinct as a
> % so seldom. Really, any time n_distinct is over 10K we should be
> estimating a % instead. Now, estimating that % has its own issues, but
> it does seem like a peculiar quirk of our stats model.
>
> Anyway, in the particular case I posted fixing n_distinct to realistic
> numbers (%) fixed the query plan.
>

But wouldn't fixing the absolute number also have fixed the plan? If you
are going to set a number manually and then nail it in place so that
analyze stops changing it, then I can certainly see how the fractional
method is desirable. But if the goal is not to do that but have the
correct value estimated in the first place, I don't really see much benefit
from converting the estimate into a fraction and then back again.
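The fractional form Josh describes is expressed as a negative n_distinct value, which the planner multiplies by the table's current reltuples, so the estimate tracks table growth without re-tuning. A sketch with hypothetical names:

```sql
-- -0.05 means "distinct values = 5% of the row count",
-- so the estimate scales as the table grows:
ALTER TABLE events ALTER COLUMN user_id SET (n_distinct = -0.05);
ANALYZE events;
```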

> >
> > The problem, as I see it, is different. We assume that if there are
> > 100 distinct values and you use LIMIT 1 that you would only need to
> > scan 1% of rows. We assume that the data is arranged in the table in a
> > very homogeneous layout. When data is not, and it seldom is, we get
> > problems.
> >
> > Simply put, assuming that LIMIT will reduce the size of all scans is
> > just way wrong. I've seen many plans where increasing the LIMIT
> > dramatically improves the plan.
> >
> > If we can at least agree it is a problem, we can try to move forwards.
>

I don't think anyone doubts there is a problem (many more than one of
them), there is just disagreement about the priority and what can be done
about it.
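Simon's arithmetic can be sketched in a few lines. This is a toy model of the uniform-spread assumption, not the planner's actual costing code:

```python
def planner_rows_scanned(total_rows, matching_rows, limit):
    """Rows the planner expects a LIMIT scan to read, assuming the
    matching rows are spread evenly through the table."""
    return min(total_rows, total_rows * limit / matching_rows)

# 10M-row table, 100 matching rows, LIMIT 1:
# the planner expects to read 1% of the table and stop.
uniform = planner_rows_scanned(10_000_000, 100, 1)   # 100,000 rows

# But if all 100 matches sit at the physical end of the table,
# an abort-early scan reads almost everything before the first hit:
worst_case = 10_000_000 - 100                        # 9,999,900 rows
```

The hundredfold gap between the estimate and the worst case is exactly the kind of mis-costing that makes a LIMIT plan look cheap when it isn't.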

>
> That is certainly another problem. Does correlation stat figure in the
> LIMIT calculation at all, currently? That's what correlation stat is
> for, no?
>

I don't think correlation is up to the task as a complete solution,
although it might help a little. There is no way a simple correlation can
encode that John retired 15 years ago and hasn't logged on since, while
Johannes was hired yesterday and never logged on before then.

Cheers,

Jeff
