Re: Collect frequency statistics for arrays

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Noah Misch <noah(at)leadboat(dot)com>, Nathan Boley <npboley(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Collect frequency statistics for arrays
Date:	2012-02-29 21:19:03
Message-ID:	CAPpHfdtXXzr99iDLM8uh_0NnZNj9qTzNb0dkOwmtNTdm080kcw@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Mar 1, 2012 at 1:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> > On Thu, Mar 1, 2012 at 12:39 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I am starting to look at this patch now. I'm wondering exactly why the
> >> decision was made to continue storing btree-style statistics for arrays,
>
> > Probably, btree statistics really does matter for some sort of arrays? For
> > example, arrays representing paths in the tree. We could request a subtree
> > in a range query on such arrays.
>
> That seems like a pretty narrow, uncommon use-case. Also, to get
> accurate stats for such queries that way, you'd need really enormous
> histograms. I doubt that the existing parameters for histogram size
> will permit meaningful estimation of more than the first array entry
> (since we don't make the histogram any larger than we do for a scalar
> column).
>
> The real point here is that the fact that we're storing btree-style
> stats for arrays is an accident, backed into by having added btree
> comparators for arrays plus analyze.c's habit of applying default
> scalar-oriented analysis functions to any type without an explicit
> typanalyze entry. I don't recall that we ever thought hard about
> it or showed that those stats were worth anything.
>

OK. I don't object to removing btree stats from arrays.
What do you think about the pg_stats view in this case? Should it combine
the values histogram and the array length histogram in a single column, as
is done for MCV and MCELEM?
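
For reference, the subtree use-case mentioned in the quoted text can be sketched as follows. This is only an illustration, not PostgreSQL code: it assumes tree paths are stored as non-empty sequences of integers and compared lexicographically (Python tuples stand in for arrays), and the helper names `subtree_range` and `subtree` are hypothetical.

```python
from bisect import bisect_left


def subtree_range(prefix):
    """Return (lo, hi) so that lo <= path < hi selects `prefix` and its descendants.

    Assumes a non-empty prefix of integers and lexicographic ordering.
    """
    lo = tuple(prefix)
    # The smallest path that sorts after every descendant of `prefix`
    # is the prefix with its last element incremented.
    hi = lo[:-1] + (lo[-1] + 1,)
    return lo, hi


def subtree(sorted_paths, prefix):
    """Select the subtree from a sorted list of paths with two binary searches."""
    lo, hi = subtree_range(prefix)
    return sorted_paths[bisect_left(sorted_paths, lo):bisect_left(sorted_paths, hi)]


paths = sorted([(1,), (1, 2), (1, 2, 5), (1, 3), (2,), (2, 1)])
print(subtree(paths, (1, 2)))  # [(1, 2), (1, 2, 5)]
```

Estimating the selectivity of such a range from a histogram would require bucket boundaries that distinguish paths beyond their first element, which is the concern raised in the quoted text.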

------
With best regards,
Alexander Korotkov.

In response to

Re: Collect frequency statistics for arrays at 2012-02-29 21:09:51 from Tom Lane

Responses

Re: Collect frequency statistics for arrays at 2012-03-01 14:57:17 from Alexander Korotkov
