Re: Yet another abort-early plan disaster on 9.3

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Yet another abort-early plan disaster on 9.3
Date: 2014-10-02 08:19:30
Message-ID: CA+U5nMKqqPNpSr2CQ2V0=9bjWVv-OA28o3BoxYP=7YVfyswdzw@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On 1 October 2014 19:56, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 09/30/2014 04:01 PM, Simon Riggs wrote:
>> On 30 September 2014 18:28, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>
>>>> Anyway, in the particular case I posted fixing n_distinct to realistic
>>>> numbers (%) fixed the query plan.
>>>
>>>
>>> But wouldn't fixing the absolute number also have fixed the plan?
>>
>> There are two causes of this issue.
>>
>> 1. Poor estimates of n_distinct. Fixable by user.
>>
>> 2. Poor assumption of homogeneous distribution. No way for user to
>> fix. Insufficient stats detail to be able to solve in current planner.
>>
>> I see (2) as the main source of issues, since as we observe, (1) is fixable.
>
> I disagree that (1) is not worth fixing just because we've provided
> users with an API to override the stats. It would unquestionably be
> better for us to have a better n_distinct estimate in the first place.
> Further, this is an easier problem to solve, and fixing n_distinct
> estimates would fix a large minority of currently pathological queries.
> It's like saying "hey, we don't need to fix the leak in your radiator,
> we've given you a funnel in the dashboard you can pour water into."

Having read papers on it, I believe the problem is intractable. Coding
is not the issue. To anyone: please prove me wrong, in detail, with
references so it can be coded.
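
Meanwhile, for anyone hitting this in practice, the per-column override
mentioned above is the n_distinct attribute option (table and column
names here are hypothetical):

  -- A negative value is treated as a fraction of the row count,
  -- so -0.05 means "about 5% of the rows are distinct values";
  -- a positive value is an absolute distinct count.
  ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = -0.05);
  ANALYZE orders;  -- the override is picked up at the next ANALYZE

That covers (1) for a column you already know is mis-estimated; it does
nothing for (2).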

> I do agree that (2) is worth fixing *as well*. In a first
> approximation, one possibility (as Tom suggests) would be to come up
> with a mathematical model for a selectivity estimate which was somewhere
> *between* homogeneous distribution and the worst case. While that
> wouldn't solve a lot of cases, it would be a start towards having a
> better model.

This may have a reasonable solution, but I don't know it. A more
accurate mathematical model still doesn't remove the main problem: it
is a guess, not certain knowledge, and the risk will still remain.
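
To make the shape of that concrete (the numbers and the weight w here
are purely illustrative): for an abort-early plan that needs k matching
rows out of N, with filter selectivity s, the homogeneous assumption
says we stop after roughly k/s rows, while the worst case is scanning
nearly all N rows. A blended estimate of the kind you describe would be
something like

  rows_scanned ~= (1 - w) * (k / s) + w * N,   with 0 <= w <= 1

With N = 10 million, s = 0.001 and k = 100 that ranges from 100,000
rows (w = 0) up to ~10 million (w = 1). Whatever fixed w we pick is
still a guess about where reality falls in that range, which is exactly
the residual risk above.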

>>> I don't think correlation is up to the task as a complete solution, although
>>> it might help a little. There is no way a simple correlation can encode
>>> that John retired 15 years ago and hasn't logged on since, while Johannes
>>> was hired yesterday and never logged on before then.
>>
>> Ah, OK, essentially the same example.
>>
>> Which is why I ruled out correlation stats based approaches and
>> suggested a risk-weighted cost approach.
>
> By "risk-weighted" you mean just adjusting cost estimates based on what
> the worst case cost looks like, correct? That seemed to be your
> proposal from an earlier post. If so, we're in violent agreement here.

I proposed a clear path for this earlier in the thread and received no
comments as yet. Please look at that.
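
In outline (this is only the general shape, not the specific proposal;
see the earlier post for that), a risk-weighted comparison would rank
plans by something like

  effective_cost = expected_cost + r * (worst_case_cost - expected_cost)

with r reflecting how much we are willing to pay to avoid the bad case,
so a plan with a catastrophic worst case can lose to a slightly more
expensive plan whose worst case is bounded.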

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
