Re: Specifying attribute slot for storing/reading statistics

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Esteban Zimanyi <ezimanyi(at)ulb(dot)ac(dot)be>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Specifying attribute slot for storing/reading statistics
Date: 2019-09-05 15:11:24
Lists: pgsql-hackers

Esteban Zimanyi <ezimanyi(at)ulb(dot)ac(dot)be> writes:
> We are developing the analyze/selectivity functions for those types. Our
> approach is to use the standard PostgreSQL/PostGIS functions for the value
> and the time dimensions where the slots starting from 0 will be used for
> the value dimension, and the slots starting from 2 will be used for the
> time dimension. For example, for tfloat we use range_typanalyze and related
> functions for
> * collecting in slots 0 and 1, STATISTIC_KIND_BOUNDS_HISTOGRAM and
>   STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for the float ranges of the value
>   dimension
> * collecting in slots 2 and 3, STATISTIC_KIND_BOUNDS_HISTOGRAM and
>   STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for the periods (similar to
>   tstzranges) of the time dimension

IMO this is fundamentally wrong, or at least contrary to the design
of pg_statistic. It is not supposed to matter which "slot" a given
statistic type is actually stored in; rather, readers are supposed to
search for the desired statistic type using the stakindN, staopN and
(if relevant) stacollN fields.

In this case it seems like it'd be reasonable to rely on the staop
fields to distinguish between the value and time dimensions, since
(IIUC) they're of different types.
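
As a rough illustration of the lookup described above, here is a minimal standalone sketch in C. It models only the stakindN/staopN part of a pg_statistic row and is not PostgreSQL source: the ModelStats struct, the find_slot function and the two operator OIDs are hypothetical names invented for this example. The kind values 6 and 7 are meant to mirror STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM and STATISTIC_KIND_BOUNDS_HISTOGRAM.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 5                    /* mirrors STATISTIC_NUM_SLOTS */
#define KIND_RANGE_LENGTH_HISTOGRAM 6  /* mirrors STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM */
#define KIND_BOUNDS_HISTOGRAM 7        /* mirrors STATISTIC_KIND_BOUNDS_HISTOGRAM */

/* Hypothetical operator OIDs, assumed only for this sketch. */
#define HYPO_FLOAT_RANGE_OP 90001u     /* operator on the value dimension */
#define HYPO_PERIOD_OP      90002u     /* operator on the time dimension */

typedef unsigned int Oid;

/* Hypothetical model of the stakindN/staopN columns of one statistics row. */
typedef struct ModelStats
{
    int kind[NUM_SLOTS];
    Oid op[NUM_SLOTS];
} ModelStats;

/*
 * Return the index of the first slot whose kind matches reqkind and whose
 * operator matches reqop, or -1 if there is none.  reqop == 0 means
 * "any operator".  The caller never assumes a particular slot number.
 */
int
find_slot(const ModelStats *stats, int reqkind, Oid reqop)
{
    assert(stats != NULL);

    for (int i = 0; i < NUM_SLOTS; i++)
    {
        if (stats->kind[i] == reqkind &&
            (reqop == 0 || stats->op[i] == reqop))
            return i;
    }
    return -1;
}
```

With this shape, the value and time histograms can sit in any slots and in any order; a reader asks for (KIND_BOUNDS_HISTOGRAM, HYPO_FLOAT_RANGE_OP) or (KIND_BOUNDS_HISTOGRAM, HYPO_PERIOD_OP) and gets whichever slot holds it. In the server itself, get_attstatsslot() takes a requested kind and operator in a similar way.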

Another idea is to invent your own slot kind identifiers instead of
using built-in ones. I'm not sure that there's any point in using
the built-in kind values, since (a) none of the core selectivity code
is likely to get called on your data and (b) even if it were, it'd
likely do the wrong thing. See the comments in pg_statistic.h,
starting about line 150, about assignment of non-built-in slot kinds.
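
A minimal sketch of what private kind identifiers might look like, assuming the private-use range given in the pg_statistic.h comments is 10000-30000 (worth checking against the header for your release). The MYEXT_* names and the specific numbers are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed private-use range for slot kinds; verify against pg_statistic.h. */
#define PRIVATE_KIND_MIN 10000
#define PRIVATE_KIND_MAX 30000

/* Hypothetical extension-specific kinds, one set per dimension. */
#define MYEXT_KIND_VALUE_BOUNDS_HISTOGRAM  10501
#define MYEXT_KIND_VALUE_LENGTH_HISTOGRAM  10502
#define MYEXT_KIND_TIME_BOUNDS_HISTOGRAM   10503
#define MYEXT_KIND_TIME_LENGTH_HISTOGRAM   10504

/* True if kind falls inside the assumed private-use range. */
bool
is_private_kind(int kind)
{
    return kind >= PRIVATE_KIND_MIN && kind <= PRIVATE_KIND_MAX;
}

/* Compile-time check that the lowest and highest kinds above are in range. */
_Static_assert(MYEXT_KIND_VALUE_BOUNDS_HISTOGRAM >= PRIVATE_KIND_MIN &&
               MYEXT_KIND_TIME_LENGTH_HISTOGRAM <= PRIVATE_KIND_MAX,
               "extension kinds must lie in the private-use range");
```

Because each dimension gets its own kind value, a reader can find the right statistic by kind alone, without relying on slot position and without any risk of core selectivity code misreading the data.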

> Is there any chance that the API for accessing the typanalyze and
> selectivity functions will be enhanced in a future release?

Well, maybe you could convince us that the stakind/staop scheme for
identifying statistics is inadequate so we need another identification
field (corresponding to a component of the column being described,
perhaps). I'd be strongly against assigning any semantic meaning
to the slot numbers, though. That's likely to break code that's
written according to existing conventions.

regards, tom lane
