Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Nathan Boley" <npboley(at)gmail(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, "Zeugswetter Andreas OSB sIT" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Date: 2008-06-10 23:19:36
Message-ID: 87skvk3mk7.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> (In fact, I don't think the plan would change, in this case. The reason
> for the clamp to 1 row is to avoid foolish results for join situations.)

The screw case I've seen is when you have a large partitioned table where
constraint_exclusion fails to exclude the irrelevant partitions. You're going
to get 0 rows from all but the one partition which contains the 1 row you're
looking for. But since each partition's estimate is clamped to 1 row, you end
up with an estimate of a few hundred rows coming out of the Append node.

The natural way to kill this is to allow fractional rows for these scans. We
know they're usually going to be producing 0 rows, so if each estimate were the
true expected value, the estimates would sum to 1 and the Append node would
get the right value.
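
To make the arithmetic concrete, here is a rough sketch of the two behaviours.
It is not planner code: the function name, the 200-partition setup, and the
assumption that each child expects 1/200 of a row are all made up for
illustration.

```python
# Hypothetical illustration only; this is not PostgreSQL planner code.

def append_estimate(child_rows, clamp=True):
    """Sum per-child row estimates, the way an Append node totals its inputs.

    With clamp=True every child estimate is raised to at least 1 row,
    mimicking the clamp discussed above.
    """
    total = 0.0
    for rows in child_rows:
        total += max(rows, 1.0) if clamp else rows
    return total


# Assumed setup: 200 partitions, one matching row overall, and no knowledge
# of which partition holds it, so each child expects 1/200 of a row.
n_partitions = 200
per_child = [1.0 / n_partitions] * n_partitions

clamped = append_estimate(per_child, clamp=True)       # one row per partition
fractional = append_estimate(per_child, clamp=False)   # sums to about one row

print(f"clamped estimate:    {clamped:.1f}")
print(f"fractional estimate: {fractional:.3f}")
```

With the clamp, the total grows with the number of partitions; without it, the
total stays near the single row we actually expect.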

Alternatively we could make Append more clever about estimating the number of
rows it produces. Somehow it could be informed of some holistic view of the
quals on its children and how they're inter-dependent. If it's told that only
one of its children will produce rows then it can use max() instead of sum()
to calculate its rows estimate.
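
Something like the following sketch, perhaps. The mutually_exclusive flag is a
stand-in I'm inventing for that "holistic view"; nothing like it exists in the
planner today, and how Append would actually learn this is the open question.

```python
# Hypothetical illustration only; this is not PostgreSQL planner code.

def append_rows(child_rows, mutually_exclusive=False):
    """Combine child row estimates for an Append-like node.

    Each child estimate is clamped to at least 1 row, as it is today.
    If the caller asserts that at most one child can produce rows
    (the invented mutually_exclusive flag), take the largest child
    estimate instead of adding them all up.
    """
    clamped = [max(rows, 1.0) for rows in child_rows]
    if not clamped:
        return 0.0
    return max(clamped) if mutually_exclusive else sum(clamped)


# Assumed setup: 300 partitions, only the last one holds the matching row.
children = [0.0] * 299 + [1.0]

by_sum = append_rows(children)                           # 300 rows
by_max = append_rows(children, mutually_exclusive=True)  # 1 row

print(f"sum(): {by_sum:.0f} rows, max(): {by_max:.0f} row")
```

This keeps the per-child clamp intact, so join estimates below the Append are
unaffected; only the way the Append totals its children changes.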

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
