Re: Group-count estimation statistics

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Manfred Koizar <mkoi-pg(at)aon(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Group-count estimation statistics
Date: 2005-02-01 16:15:01
Message-ID: 15045.1107274501@sss.pgh.pa.us
Lists: pgsql-hackers

Manfred Koizar <mkoi-pg(at)aon(dot)at> writes:
> On Mon, 31 Jan 2005 14:40:08 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Oh, I see, you want a "max" calculation in there too. Seems reasonable.
>> Any objections?

> Yes. :-( What I said is only true in the absence of any WHERE clause
> (or join). Otherwise the same cross-column correlation issues you tried
> to work around with the N/10 clamping might come back through the
> backdoor. I'm not sure whether coding for such a narrow use case is
> worth the trouble. Forget my idea.

No, I think it's still good. The WHERE clauses are factored in
separately (essentially by assuming their selectivity on the grouped
rows is the same as it would be on the raw rows, which is pretty bogus
but it's hard to do better). The important point is that the group
count before WHERE filtering certainly does behave as you suggest,
and so the clamp is going to be overoptimistic if it clamps to less than
the largest individual number-of-distinct-values.
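
To make the clamping concrete, here is a rough sketch of the estimate before WHERE filtering. It assumes the raw estimate is the product of the per-column distinct counts, clamped from above at N/10 and from below at the largest single-column distinct count. The function name, argument names, and exact formula are illustrative assumptions, not the planner's actual code.

```python
from math import prod


def estimate_group_count(total_rows, distinct_counts):
    """Hypothetical group-count estimate for GROUP BY over several columns.

    total_rows      -- N, the number of input rows before grouping
    distinct_counts -- estimated number of distinct values per grouping column
    """
    if not distinct_counts:
        # No grouping columns: everything falls into a single group.
        return 1.0

    # Assuming independent columns, the group count is the product of the
    # per-column distinct counts.
    product = float(prod(distinct_counts))

    # The N/10 clamp guards against cross-column correlation inflating
    # the product.
    clamped = min(product, total_rows / 10.0)

    # The "max" refinement: there cannot be fewer groups than the largest
    # individual number of distinct values, so never clamp below that.
    floor = float(max(distinct_counts))
    return max(clamped, floor)
```

With N = 1000 and distinct counts of 500 and 3, the N/10 clamp alone would give 100 groups, yet the first column by itself already guarantees 500, so the sketch returns 500. With counts of 50 and 40 the product of 2000 is clamped to 100 and the floor of 50 does not come into play.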

regards, tom lane
