Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

From:	"Nathan Boley" <npboley(at)gmail(dot)com>
To:	"Zeugswetter Andreas OSB sIT" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
Cc:	"Gregory Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Date:	2008-06-10 15:51:10
Message-ID:	6fa3b6e20806100851o5ebcbb3cice2e11d4dde5cd9d@mail.gmail.com
Lists:	pgsql-hackers

>> > One more problem with low ndistinct values is that the condition might very well
>> > hit no rows at all. But Idea 1 will largely overestimate the number of hits.

That's a good point, but I don't see a clear solution. Maybe we could
look at past queries and keep track of how often they return empty
result sets?

It seems that, in some ways, we care about the distribution of the
query values in addition to the column values...

>> > I think for low ndistinct values we will want to know the exact
>> > value + counts and not a bin. So I think we will want additional stats rows
>> > that represent "value 'a1' stats".
>>
>> Isn't that what our most frequent values list does?
>
> Maybe ? Do we have the relevant stats for each ?
> But the trick is to then exclude those values from the histogram bins.

Currently, the histogram is built only from non-MCV values.
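
To make the MCV / non-MCV split concrete, here is a rough sketch of how an equality estimate can treat the two cases. This is an illustration in Python, not the actual planner code: the function name and parameters are made up for this example, and the even spreading of the leftover frequency over the remaining distinct values is a simplifying assumption.

```python
def eq_selectivity(value, mcv_values, mcv_freqs, null_frac, n_distinct):
    """Rough selectivity estimate for `column = value` (illustrative only)."""
    # A value found in the MCV list has its frequency recorded directly.
    for v, f in zip(mcv_values, mcv_freqs):
        if v == value:
            return f

    # Otherwise, take the fraction of rows that are neither NULL nor an MCV
    # and spread it evenly over the remaining distinct values.
    rest = 1.0 - sum(mcv_freqs) - null_frac
    other_distinct = n_distinct - len(mcv_values)
    if other_distinct > 1:
        rest /= other_distinct

    # A non-MCV value should not look more common than the rarest MCV.
    if mcv_freqs:
        rest = min(rest, min(mcv_freqs))
    return max(rest, 0.0)


# Hypothetical column: two MCVs covering 80% of rows, 12 distinct values.
hit = eq_selectivity("a", ["a", "b"], [0.5, 0.3], 0.0, 12)
miss = eq_selectivity("z", ["a", "b"], [0.5, 0.3], 0.0, 12)
```

Note that `miss` comes out non-zero even if 'z' never occurs in the column, which is the overestimate for conditions that hit no rows mentioned above.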

In response to

Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics at 2008-06-10 10:16:59 from Zeugswetter Andreas OSB sIT

Responses

Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics at 2008-06-10 18:32:40 from Jeff Davis
