Re: Maximum statistics target

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Subject: Re: Maximum statistics target
Date: 2008-03-10 13:43:09
Message-ID: 87ejaibspe.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> Am Freitag, 7. März 2008 schrieb Tom Lane:
>>> IIRC, eqjoinsel is one of the weak spots, so tests involving planning of
>>> joins between two tables with large MCV lists would be a good place to
>>> start.
>
>> I have run tests with joining two and three tables with 10 million rows each,
>> and the planning times seem to be virtually unaffected by the statistics
>> target, for values between 10 and 800000.
>
> It's not possible to believe that you'd not notice O(N^2) behavior for N
> approaching 800000 ;-). Perhaps your join columns were unique keys, and
> thus didn't have any most-common-values?
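
A quick way to check that is to look at pg_stats after ANALYZE -- a unique
key column ends up with n_distinct = -1 and a NULL most_common_vals, so
eqjoinsel never sees an MCV list at all. Something like this (table and
column names are just placeholders):

    SELECT tablename, attname, n_distinct,
           array_upper(most_common_vals, 1) AS mcv_entries
    FROM pg_stats
    WHERE tablename IN ('t1', 't2')
      AND attname = 'join_col';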

We could remove the hard limit on the statistics target and instead impose
the limit on the actual size of the arrays. I.e., allow people to specify
larger sample sizes and discard unreasonably large excess data (possibly
warning them when that happens).

That would remove the screw case the original poster had, where he needed to
scan a large portion of the table to see at least one of every value even
though there were only 169 distinct values.
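
Roughly, from the user's side it might look like this (numbers and names are
made up, and the larger target would be rejected by the current hard cap):

    -- A skewed column with only 169 distinct values, the rarest of which
    -- show up once in many thousands of rows.  ANALYZE samples on the
    -- order of 300 * target rows, so a big target widens the sample:
    ALTER TABLE orders ALTER COLUMN region SET STATISTICS 10000;
    ANALYZE orders;

    -- The stored MCV list is still bounded by the number of distinct
    -- values actually seen, so it stays at <= 169 entries:
    SELECT n_distinct, array_upper(most_common_vals, 1) AS mcv_entries
    FROM pg_stats
    WHERE tablename = 'orders' AND attname = 'region';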

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
