Skip site navigation (1) Skip section navigation (2)

Re: Increasing statistics results in worse estimates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shelby Cain <alyandon(at)yahoo(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Increasing statistics results in worse estimates
Date: 2005-04-29 16:15:38
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-general
Shelby Cain <alyandon(at)yahoo(dot)com> writes:
> --- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> What did you increase it to, exactly?  Could we see
>> the contents of
>> pg_stats for these two columns at both target
>> settings?

> Generally, the more I increased the stats target the
> better the correlation estimate and the worse the row estimate.

Hm.  I should have realized why correlation wouldn't be high for the
city name: given the ordering by zipcode, city name values may be
pretty well clumped, but they aren't in any kind of alphabetical
order --- and it's the overall ordering, not the clumping, that
correlation measures.

However, there is something absolutely wacko about the stats collection
process here ... you've got fairly reasonable looking results for
most-common-values of city name at the lower end of the stats settings
(HOUSTON and DALLAS are the most common, sounds about right) ... but at
the higher settings the ordering of most-common entries just goes nuts.
We've got some kind of bug there.

What exactly are you changing in the different cases ---
default_statistics_target, or are you doing an ALTER TABLE on some
of the columns (if so which)?

It might be easier to debug this if you could send me the test case.
Any problem with sending just the city name and zipcode columns
of the table (offlist of course)?  COPY TO with a column list can
extract that for you.

			regards, tom lane

pgsql-general by date

Next:From: Tom LaneDate: 2005-04-29 16:21:43
Subject: Re: Composite types as columns used in production?
Previous:From: James RobinsonDate: 2005-04-29 15:49:40
Subject: Composite types as columns used in production?

Privacy Policy | About PostgreSQL
Copyright © 1996-2018 The PostgreSQL Global Development Group