From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Gregory Stark <stark(at)enterprisedb(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: benchmarking the query planner |
Date: | 2008-12-11 23:44:04 |
Message-ID: | 1229039044.13078.193.camel@hp_dx2400_1 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:
> > And I would like it even more if the sample size increased according
> to table size, since that makes ndistinct values fairly random for
> large
> > tables.
>
> Unfortunately _any_ ndistinct estimate based on a sample of the table
> is going to be pretty random.
We know that constructed data distributions can destroy the
effectiveness of the ndistinct estimate and make sample size irrelevant.
But typical real world data distributions do improve their estimations
with increased sample size and so it is worthwhile.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2008-12-11 23:47:51 | Re: benchmarking the query planner |
Previous Message | Tom Lane | 2008-12-11 23:43:48 | Re: benchmarking the query planner |