Re: Odd statistics behaviour in 7.2

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gordon A(dot) Runkle" <gar(at)integrated-dynamics(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Odd statistics behaviour in 7.2
Date: 2002-02-16 17:17:33
Message-ID: 20184.1013879853@sss.pgh.pa.us
Lists: pgsql-hackers

"Gordon A. Runkle" <gar(at)integrated-dynamics(dot)com> writes:
> Is "-0.503824" the same as "503824 with a predicted increase in the
> number of distinct values" (as opposed to using "-503824")?

No, it means "0.503824 times the number of rows in the table".
Although your table was ~ 1 million rows, so that's approximately
right in your case.
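
As a rough illustration of that rule, here is a minimal Python sketch of how a stadistinct value could be turned into an estimated count of distinct values. The function name and the exact row count are assumptions made for illustration (the table is only described as "~ 1 million rows"), and this is not the planner's actual code.

```python
from typing import Optional


def estimated_distinct(stadistinct: float, row_count: float) -> Optional[float]:
    """Turn a stadistinct value into an estimated number of distinct values.

    Hypothetical helper for illustration only.
    """
    if stadistinct > 0:
        # Positive: an absolute number of distinct values.
        return stadistinct
    if stadistinct < 0:
        # Negative: minus a multiplier applied to the table's row count.
        return -stadistinct * row_count
    # Zero: the number of distinct values is unknown.
    return None


# Assumed row count of exactly one million, standing in for "~ 1 million rows".
print(estimated_distinct(-0.503824, 1_000_000))
```

With a table of about a million rows the result lands close to 503824, which is why the two readings happen to roughly agree here.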

Given the stats you cited, the exactly correct stadistinct value would
be -0.9348085. In testing I got -1, -0.808612, -0.678641, or once
-0.584611 from your data, depending on whether the sample chanced to
find none, one, two, or three repeated values. Any of these strikes me
as plenty close enough for statistical purposes. But the Chaudhuri
estimator was off by more than a factor of 10.
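
To put "plenty close enough" in numbers, the sketch below compares each sampled value quoted above with the exact one. Both helper names are hypothetical, the 935-of-1000 example is made up, and the assumption that the exact value is minus the ratio of distinct values to rows follows from the "times the number of rows" rule rather than from anything stated about the original data.

```python
def exact_stadistinct(distinct_count: int, row_count: int) -> float:
    """Exact value in fractional form: minus (distinct values / rows).

    Hypothetical helper; assumes the column is stored in the negative form.
    """
    return -distinct_count / row_count


def error_factor(estimate: float, truth: float) -> float:
    """Ratio by which an estimate misses the truth, always >= 1."""
    a, b = abs(estimate), abs(truth)
    return max(a, b) / min(a, b)


# Made-up example: 935 distinct values in a 1000-row table.
print(exact_stadistinct(935, 1000))

# Values quoted in the message above.
exact = -0.9348085
for sampled in (-1.0, -0.808612, -0.678641, -0.584611):
    print(sampled, round(error_factor(sampled, exact), 2))
```

Every sampled value stays within a factor of 1.6 of the exact one, against the factor of more than 10 reported for the Chaudhuri estimator.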

> Are you planning to include this patch in v7.2.1, or would it require
> too much testing by others?

I'm going to put it in 7.2.1 unless there are objections.

regards, tom lane