Re: RFC: planner statistics in 7.2

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: RFC: planner statistics in 7.2
Date: 2001-04-20 00:44:05
Message-ID: 3.0.5.32.20010420104405.02b2ce60@mail.rhyme.com.au
Lists: pgsql-hackers

At 18:37 19/04/01 -0400, Tom Lane wrote:
>(2) Statistics should be computed on the basis of a random sample of the
>target table, rather than a complete scan. According to the literature
>I've looked at, sampling a few thousand tuples is sufficient to give good
>statistics even for extremely large tables; so it should be possible to
>run ANALYZE in a short amount of time regardless of the table size.

This sounds great; can the same be done for clustering? I.e., pick a random
sample of index nodes, look at the record pointers, and so determine how
well clustered the table is?
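
A minimal sketch of the idea, not anything that exists in the backend: assume
the index can be read as a list of heap block numbers in index order (the
name `index_block_numbers` is hypothetical), sample some adjacent pairs of
entries, and count how often the second pointer lands in the same or the next
heap block. The "same or next block" test is just one possible definition of
"well clustered".

```python
import random


def estimate_clustering(index_block_numbers, sample_size, seed=None):
    """Estimate table clustering from a random sample of index entries.

    index_block_numbers: heap block numbers listed in index order
        (hypothetical input; a real index would have to be walked to get it).
    sample_size: number of adjacent entry pairs to inspect.
    Returns the fraction of sampled pairs whose heap blocks are the same
    or consecutive, between 0.0 (unclustered) and 1.0 (fully clustered).
    """
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    n = len(index_block_numbers)
    if n < 2:
        # Zero or one entry: trivially clustered.
        return 1.0
    rng = random.Random(seed)
    k = min(sample_size, n - 1)
    # Choose k distinct positions; each position i stands for the pair (i, i+1).
    positions = rng.sample(range(n - 1), k)
    hits = 0
    for i in positions:
        step = index_block_numbers[i + 1] - index_block_numbers[i]
        if 0 <= step <= 1:
            hits += 1
    return hits / k
```

A table loaded in index order scores near 1.0 and one loaded in random order
scores near 0.0, while the cost depends only on the sample size, not on the
table size.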

>A simple approach would be a SET
>variable or explicit parameter for ANALYZE. But I am inclined to think
>that it'd be better to create a persistent per-column state for this,
>set by say
> ALTER TABLE tab SET COLUMN col STATS COUNT n

Sounds fine; user-selectability at the column level seems a good idea.
Would there be any value in keeping it out of the normal SQLxx statements
and adding a separate 'ALTER STATISTICS' command instead? E.g.:

ALTER STATISTICS FOR tab[.column] COLLECT n
ALTER STATISTICS FOR tab SAMPLE m

etc.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd.
(A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request,
and from pgp5.ai.mit.edu:11371
