Revisiting default_statistics_target

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Revisiting default_statistics_target
Date: 2009-05-22 16:27:33
Message-ID: alpine.GSO.2.01.0905221149310.6633@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Yesterday Jignesh Shah presented his extensive benchmark results comparing
8.4-beta1 with 8.3.7 at PGCon:
http://blogs.sun.com/jkshah/entry/pgcon_2009_performance_comparison_of

While most cases were dead even or a modest improvement, his dbt-2 results
suggest a 15-20% regression in 8.4. Changing the default_statistics_taget
to 100 was responsible for about 80% of that regression. The remainder
was from the constraint_exclusion change. That 80/20 proportion was
mentioned in the talk but not in the slides. Putting both those back to
the 8.3 defaults swapped things where 8.4b1 was ahead by 5% instead.
(Note that all of the later benchmarks in his slides continued to use the
default parameters, that change was only tested with that specific
workload)

The situation where the stats target being so low hurts things the most
are the data warehouse use cases. Josh Berkus tells me that his latest DW
testing suggests that the 10->100 increase turns out to be insufficient
anyway; 400+ is the range you really need that to be in. I did a quick
survey of some other community members who work in this space and that
experience is not unique. Josh has some early tools that tackle this
problem by adjusting the stats target only when it's critical--on indexed
columns for example. I'm going to work with him to help get those
polished, and to see if we can replicate some of those cases via a public
benchmark.

The bump from 10 to 100 was supported by microbenchmarks that suggested it
would be tolerable. That doesn't seem to be reality here though, and it's
questionable whether this change really helps the people who need to fool
with the value the most. This sort of feedback is exactly why it made
sense to try this out during the beta cycle. But unless someone has some
compelling evidence to the contrary, it looks like the stats target needs
to go back to a lower value. I think the best we can do here is to
improve the documentation about this parameter and continue to work on
tuning guides and tools to help people set it correctly.

As for the change to constraint_exclusion, the regression impact there is
much less severe and the downside of getting it wrong is pretty bad.
Rather than reverting it, the ideal response to that might be to see if
it's possible to improve the "partition" code path. But as I'm not going
to volunteer to actually do that, I really don't get a vote here anyway.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Sabino Mullane 2009-05-22 16:43:42 Re: Revisiting default_statistics_target
Previous Message Greg Stark 2009-05-22 16:11:46 Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE