Re: hist boundary duplicates bug in head and 8.3

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Nathan Boley <npboley(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hist boundary duplicates bug in head and 8.3
Date: 2009-01-07 10:39:32
Message-ID: 1231324772.15005.153.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Tue, 2009-01-06 at 18:40 -0500, Tom Lane wrote:
> "Nathan Boley" <npboley(at)gmail(dot)com> writes:
> >> I don't think this is a bug.
>
> > hmmm... Well, I assumed it was a bug from a comment in analyze.
>
> > From ( near ) line 2130 in analyze.c
>
> > * least 2 instances in the sample. Also, we won't suppress values
> > * that have a frequency of at least 1/K where K is the intended
> > * number of histogram bins; such values might otherwise cause us to
> > * emit duplicate histogram bin boundaries.
>
> That's talking about a case where we have a choice whether to include a
> value in the MCV list or not. Once the MCV list is maxed out, we can't
> do anything to avoid duplicates.

Surely the most important point in the OP was that ineqsel does not
correctly binary search in the presence of duplicates.
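
To illustrate the general concern (this is a hypothetical sketch, not
the ineqsel code, and the function names and integer boundaries are my
own assumptions): when a sorted boundary array contains repeated
values, a binary search has to decide which end of the run of
duplicates it converges on. Searching for "strictly less than" and for
"less than or equal" give different positions, and a search that stops
at the first match it happens to hit can land anywhere inside the run.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of entries in the sorted array
 * bounds[0..n-1] that are strictly less than key.  With duplicates,
 * this converges on the first element of the run equal to key.
 */
static int
count_less(const int *bounds, int n, int key)
{
    int lo = 0;
    int hi = n;

    assert(n >= 0);
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (bounds[mid] < key)
            lo = mid + 1;       /* everything up to mid is < key */
        else
            hi = mid;           /* bounds[mid] >= key, keep looking left */
    }
    return lo;
}

/*
 * Hypothetical helper: number of entries that are less than or equal
 * to key.  With duplicates, this converges just past the last element
 * of the run equal to key.
 */
static int
count_less_equal(const int *bounds, int n, int key)
{
    int lo = 0;
    int hi = n;

    assert(n >= 0);
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (bounds[mid] <= key)
            lo = mid + 1;       /* everything up to mid is <= key */
        else
            hi = mid;           /* bounds[mid] > key, keep looking left */
    }
    return lo;
}
```

For boundaries {1, 3, 3, 3, 7, 9} and key 3, count_less returns 1 and
count_less_equal returns 4, so the two comparisons disagree by the
length of the duplicate run. Any selectivity estimate built on the
resulting position depends on which of the two is used.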

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
