Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3

From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig James <craig_james(at)emolecules(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance Anomaly with "col in (A,B)" vs. "col = A OR col = B" ver. 9.0.3
Date: 2011-09-27 00:09:37
Message-ID: 4E811441.6060604@ringerc.id.au
Lists: pgsql-performance

On 27/09/2011 1:35 AM, Tom Lane wrote:
> Craig James <craig_james(at)emolecules(dot)com> writes:
>> On 9/26/11 10:07 AM, Tom Lane wrote:
>>> Cranking up the statistics target for the hts_code_id column (and
>>> re-ANALYZEing) ought to fix it. If all your tables are this large you
>>> might want to just increase default_statistics_target across the board.
>>>
>>> regards, tom lane
>> This is common advice in this forum .... but what's the down side to
>> increasing statistics? With so many questions coming to this forum that
>> are due to insufficient statistics, why not just increase the
>> default_statistics_target? I assume there is a down side, but I've never
>> seen it discussed. Does it increase planning time? Analyze time? Take
>> lots of space?
> Yes, yes, and yes. We already did crank up the default
> default_statistics_target once (in 8.4), so I'm hesitant to do it again.

This has me wondering about putting together a maintenance/analysis tool
that generates and captures stats from several ANALYZE runs and compares
them to see if they're reasonably consistent. It would then re-run with
higher targets as a one-off, again checking whether the stats agree,
before restoring the targets to their defaults. The tool could then
compare the resulting stats and warn about tables or columns where the
default stats targets aren't sufficient.
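
To make the comparison step concrete, here is a rough sketch in Python. It
is only an illustration under assumptions, not a worked-out tool: each
"snapshot" is assumed to be a dict mapping (table, column) to the n_distinct
value read from pg_stats after one ANALYZE run, the function name and the
20% tolerance are made up, and the table name "orders" in the usage note is
hypothetical. Collecting the snapshots from the database is left out.

```python
from statistics import mean, pstdev


def unstable_columns(snapshots, tolerance=0.2):
    """Flag columns whose estimate varies too much between ANALYZE runs.

    snapshots: list of dicts, one per ANALYZE run, mapping
               (table, column) -> n_distinct as read from pg_stats.
               (Assumed input shape; gathering it is not shown here.)
    tolerance: hypothetical threshold on relative spread
               (population standard deviation divided by |mean|).

    Returns {(table, column): relative_spread} for the flagged columns.
    """
    if len(snapshots) < 2:
        return {}  # nothing to compare against

    # Only compare columns that appear in every run.
    common = set.intersection(*(set(s) for s in snapshots))

    flagged = {}
    for key in sorted(common):
        values = [s[key] for s in snapshots]
        centre = abs(mean(values))
        if centre == 0:
            continue  # relative spread is undefined; skipped in this sketch
        spread = pstdev(values) / centre
        if spread > tolerance:
            flagged[key] = spread
    return flagged
```

Fed three runs in which ("orders", "hts_code_id") comes back as 1200, 3100
and 800 while ("orders", "id") is -1.0 every time, this flags only the
hts_code_id column. A real tool would probably want to compare more than
n_distinct (null_frac and the most-common-value lists, for example), and
would need to handle n_distinct switching between its absolute and
fractional (negative) forms, which this sketch ignores.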

In the long run this might even be something it'd be good to have Pg do
automatically behind the scenes (like autovacuum) - auto-raise stats
targets where repeated samplings are inconsistent.
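
The auto-raise idea could look roughly like the loop below. Again this is a
sketch under assumptions, not a proposal for how the server would do it:
`sample` is a hypothetical callable that sets the column's target (for
instance via ALTER TABLE ... ALTER COLUMN ... SET STATISTICS), runs ANALYZE
and returns one numeric estimate; the doubling policy, the run count and the
tolerance are arbitrary choices. The start of 100 matches the default since
8.4, and 10000 is the maximum target PostgreSQL accepts.

```python
def raise_until_stable(sample, start=100, ceiling=10000, runs=3,
                       tolerance=0.2):
    """Double the statistics target until repeated samples agree.

    sample:    hypothetical callable taking a target and returning one
               numeric estimate obtained after ANALYZE at that target.
    Returns (target, stable): the last target tried and whether the
    samples at that target agreed to within `tolerance`.
    """
    target = start
    while True:
        values = [sample(target) for _ in range(runs)]
        centre = abs(sum(values) / len(values))
        # Relative range of the samples; treated as 0 if the mean is 0.
        spread = (max(values) - min(values)) / centre if centre else 0.0
        stable = spread <= tolerance
        if stable or target >= ceiling:
            return target, stable
        target = min(target * 2, ceiling)
```

The loop gives up at the ceiling and reports that the column never settled,
which is probably the case a warning would be most useful for.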

Thoughts? Is this reasonable to explore, or a totally bogus idea? I'll
see if I can have a play if there's any point to trying it out.

--
Craig Ringer
