Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Nathan Boley <npboley(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Date: 2008-06-09 15:11:09
Message-ID: 1213024269.7180.104.camel@jdavis
Lists: pgsql-hackers

On Sun, 2008-06-08 at 19:03 -0400, Tom Lane wrote:
> Your argument seems to consider only columns having a normal
> distribution. How badly does it fall apart for non-normal
> distributions? (For instance, Zipfian distributions seem to be pretty
> common in database work, from what I've seen.)
>

With "Idea 1: Keep an array of stadistinct that correspond to each
bucket size," I would expect the estimate to still be better than the
current one, even for non-normal distributions, because it keeps a
separate ndistinct for each histogram bucket rather than a single
value for the whole column.
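
To make the intuition concrete, here is a minimal sketch in Python. It
is not PostgreSQL's eqsel code: the function names are hypothetical,
MCV lists are ignored, the "global" estimate is simplified to
1/ndistinct, and the data is a small hand-made skewed column rather
than a true Zipfian sample.

```python
from bisect import bisect_left


def build_histogram(values, n_buckets):
    """Build an equi-depth histogram (about the same row count per bucket).

    Returns a list of (upper_bound, row_fraction, ndistinct) tuples.
    The per-bucket ndistinct is the extra statistic under discussion.
    """
    rows = sorted(values)
    total = len(rows)
    buckets = []
    for i in range(n_buckets):
        start = i * total // n_buckets
        end = (i + 1) * total // n_buckets
        chunk = rows[start:end]
        if not chunk:
            continue
        buckets.append((chunk[-1], len(chunk) / total, len(set(chunk))))
    return buckets


def eqsel_global(values):
    """Simplified current-style estimate: one ndistinct for the column."""
    return 1.0 / len(set(values))


def eqsel_per_bucket(buckets, value):
    """Estimate using the ndistinct of the bucket that would hold value.

    Assumption: only upper bounds are checked, so a value below the
    column minimum is attributed to the first bucket.
    """
    uppers = [b[0] for b in buckets]
    i = bisect_left(uppers, value)
    if i == len(buckets):
        return 0.0  # above the highest bound: assume no matching rows
    _, row_fraction, ndistinct = buckets[i]
    return row_fraction / ndistinct


if __name__ == "__main__":
    # Skewed toy column: two frequent values, then eight rare ones.
    column = [1] * 4 + [2] * 4 + list(range(3, 11))
    hist = build_histogram(column, 2)
    for v in (1, 7):
        true_sel = column.count(v) / len(column)
        print(v, true_sel, eqsel_global(column), eqsel_per_bucket(hist, v))
```

On this toy column the single ndistinct gives the same selectivity for
every value, while the per-bucket version separates the frequent values
in the first bucket from the rare ones in the second. How well that
carries over to sampled statistics on real Zipfian data is a separate
question that this sketch does not answer.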

Regards,
Jeff Davis
