Re: Bad Plan for Questionnaire-Type Query

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Blewett <david(at)dawninglight(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Bad Plan for Questionnaire-Type Query
Date: 2009-06-06 02:02:18
Message-ID: 603c8f070906051902i118af4fbp80fb41d4714b8df6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Jun 5, 2009 at 8:29 PM, David Blewett<david(at)dawninglight(dot)net> wrote:
> On Fri, Jun 5, 2009 at 7:32 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> My first thought would be to increase statistics dramatically on the
>> filtered columns in hopes of making PG realize there's a lot of rows there;
>> it's off by 8x.  Correlations stats are an ongoing issue in PostgreSQL.
>
> I started at a stats_target of 250, then tried 500 and finally the
> plan that I pasted before resorting to disabling nestloops was at 1000
> (and re-analyzing in between of course). Will a CLUSTER or REINDEX
> help at all?

Probably not. Your problem is similar to the one Anne Rosset was
complaining about on -performance a couple of days ago, though your
case is appears to be more complex.

http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php

It's really not clear what to do about this problem. In Anne's case,
it would probably be enough to gather MCVs over the product space of
her folder_id and is_deleted columns, but I'm not certain that would
help you. It almost seems like we need a way to say "for every
distinct value that appears in column X, you need to gather separate
statistics for the other columns of the table". But that could make
statistics gathering and query planning very expensive.

Another angle of attack, which we've talked about before, is to teach
the executor that when a nestloop with a hash-joinable condition
executes too many times, it should hash the inner side on the next
pass and then switch to a hash join.

But none of this helps you very much right now...

...Robert

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2009-06-06 02:17:51 Re: degenerate performance on one server of 3
Previous Message Erik Aronesty 2009-06-06 01:50:22 Re: degenerate performance on one server of 3