Re: Improving N-Distinct estimation by ANALYZE

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving N-Distinct estimation by ANALYZE
Date: 2006-01-04 19:49:16
Message-ID: 6579.1136404156@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> [ ... a large amount of analysis based on exactly one test case ... ]

I think you are putting too much emphasis on fixing one case and not
enough on considering what may happen in other cases ...

In general, estimating n-distinct from a sample is just plain a hard
problem, and it's probably foolish to suppose we'll ever be able to
do it robustly. What we need is to minimize the impact when we get
it wrong. So I agree with the comment that we need to finish the
unfinished project of making HashAggregate tables expansible, but
I'm dubious about the rest of this.
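
To illustrate why a sample-based n-distinct estimate is fragile, here is a
minimal sketch. It is not taken from this message: it assumes the Haas-Stokes
style formula n*d / (n - f1 + f1*n/N), where n is the sample size, d the
number of distinct values in the sample, f1 the number of values seen exactly
once, and N the total row count. The function and variable names are
hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def haas_stokes_estimate(sample, total_rows):
    """Estimate the number of distinct values in a table from a row sample.

    Assumed formula (illustrative): n*d / (n - f1 + f1*n/N).
    """
    n = len(sample)
    if n == 0:
        return 0.0
    counts = Counter(sample)
    d = len(counts)                                  # distinct values in the sample
    f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    denom = n - f1 + f1 * n / total_rows
    return n * d / denom


def compare(table, sample_size, seed=0):
    """Return (true n-distinct, estimated n-distinct) for one random sample."""
    rng = random.Random(seed)
    sample = rng.sample(table, sample_size)
    return len(set(table)), haas_stokes_estimate(sample, len(table))


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical tables: one where every value appears twice, and one where
    # a few values dominate and the rest are rare.
    paired = [i // 2 for i in range(100_000)]
    skewed = [int(rng.paretovariate(1.0)) for _ in range(100_000)]
    for name, table in (("paired", paired), ("skewed", skewed)):
        actual, estimated = compare(table, 3_000)
        print(f"{name}: actual={actual} estimated={estimated:.0f}")
```

The estimate depends heavily on f1, so whether the rare values happen to land
in the sample can swing the result a great deal. That sensitivity is why
limiting the damage from a wrong estimate matters more than tuning the
estimator for one test case.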

regards, tom lane
