Quick Links

Re: -HEAD planner issue wrt hash_joins on dbt3 ?

From:	Mark Cave-Ayland <mark(dot)cave-ayland(at)ilande(dot)co(dot)uk>
To:	Matteo Beccati <php(at)beccati(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: -HEAD planner issue wrt hash_joins on dbt3 ?
Date:	2006-09-19 13:55:38
Message-ID:	1158674138.5800.15.camel@mca-desktop
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, 2006-09-18 at 17:46 +0200, Matteo Beccati wrote:
> Tom Lane ha scritto:
> > Matteo Beccati <php(at)beccati(dot)com> writes:
> >> I cannot see anything bad by using something like that:
> >> if (histogram is large/representative enough)
> >
> > Well, the question is exactly what is "large enough"? I feel a bit
> > uncomfortable about applying the idea to a histogram with only 10
> > entries (especially if we ignore two of 'em). With 100 or more,
> > it sounds all right. What's the breakpoint?
>
> Yes, I think 100-200 could be a good breakpoint. I don't actually know
> what is the current usage of SET STATISTICS, I usually set it to 1000
> for columns which need more precise selectivity.
>
> The breakpoint could be set even higher (500?) so there is space to
> increase statistics without enabling the histogram check, but I don't
> feel very comfortable though suggesting this kind of possibly
> undocumented side effect...

Hi everyone,

You may be interested to have a look at the statistics collector for the
geometry type within PostGIS. In order to prevent very large or very
small geometries from ruining the statistics histogram and generating
incorrect query plans, we make the assumption that the column
distribution is likely to be close to normal, and then remove any
ANALYZE-collected geometries from the set that lie outside +/- 3.25
standard deviations from the mean before creating the final histogram
(removes just under 1% of the data from each end of an assumed normal
distribution). This works well and AFAIK we've only ever had one
reported case of an incorrect query plan being generated using this
method.

Kind regards,

Mark.

In response to

Re: -HEAD planner issue wrt hash_joins on dbt3 ? at 2006-09-18 15:46:24 from Matteo Beccati

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Heikki Linnakangas	2006-09-19 13:59:41	Re: system cache and buffer cache
Previous Message	Jim C. Nasby	2006-09-19 13:53:05	Re: [PATCHES] Patch for UUID datatype (beta)