
Re: Collect frequency statistics for arrays

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Nathan Boley <npboley(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collect frequency statistics for arrays
Date: 2011-12-29 16:35:00
Message-ID: 20111229163500.GA647@tornado.leadboat.com
Lists: pgsql-hackers
On Wed, Nov 09, 2011 at 08:49:35PM +0400, Alexander Korotkov wrote:
> Rebased with head.

I took a look at this patch.  The basic approach seems sensible.  For array
columns, ANALYZE will now determine the elements most often present in the
column's arrays.  It will also calculate a histogram from the counts of
distinct elements in each array.  That is to say, both newly-collected
statistics treat {1,1,2,3,1} and {3,1,2} identically.  New selectivity
machinery uses the new statistics to optimize these operations:

column @> const
column <@ const
column && const
const = ANY (column)
const = ALL (column)

Concrete estimates look mostly sane, with a few corner cases I note below.


We have many implementation details to nail down.

With the patch applied, I get the attached regression.diffs from "make check".
The palloc errors indicate bugs, but rules.out just needs a refresh.

During ANALYZE, we'll now detoast all array column values regardless of size,
just as we already do for tsvector columns.  That may be reasonable enough for
tsvector table/index columns, whose very presence is a hint that the user has
planned to use the value for searching.  Since arrays make no such
implication, should we skip large arrays to constrain TOAST I/O?  Say, skip
arrays larger than 100 KiB or 1 MiB?

I find distressing the thought of having two copies of the lossy sampling
code, each implementing the algorithm with different variable names and levels
of generality.  We might someday extend this to hstore, and then we'd have yet
another copy.  Tom commented[1] that ts_typanalyze() and array_typanalyze()
should remain distinct, and I agree.  However, they could call a shared
counting module.  Is that practical?  Possible API:

typedef struct LossyCountCtl LossyCountCtl;
LossyCountCtl *LossyCountStart(float s,
							   float epsilon,
							   int2 typlen,
							   bool typbyval,
							   Oid eqfunc); /* + hash func, a few others */
void LossyCountAdd(LossyCountCtl *ctl, Datum elem);
TrackItem **LossyCountGetAll(LossyCountCtl *ctl);

[1] http://archives.postgresql.org/message-id/12406.1298055475@sss.pgh.pa.us
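To make the shape of such a module concrete, here is a throwaway, self-contained sketch of the counting core.  Integer elements and a fixed-size linear-scan table stand in for the datum hashing; none of these names are the proposed API, and a real version would of course hash and grow the table:

```c
#include <stdbool.h>
#include <string.h>

#define MAX_TRACKED 1024

typedef struct
{
	int		value;			/* the element e */
	int		f;				/* occurrence count */
	int		delta;			/* maximum error in f */
	bool	used;
} LCItem;

typedef struct
{
	LCItem	items[MAX_TRACKED];
	int		w;				/* bucket width, i.e. 1/epsilon */
	int		b_current;		/* current bucket number */
	int		n;				/* total elements seen */
} LCState;

static void
lc_init(LCState *st, int w)
{
	memset(st, 0, sizeof(*st));
	st->w = w;
	st->b_current = 1;
}

/* drop entries that can no longer reach the frequency threshold */
static void
lc_prune(LCState *st)
{
	for (int i = 0; i < MAX_TRACKED; i++)
		if (st->items[i].used &&
			st->items[i].f + st->items[i].delta <= st->b_current)
			st->items[i].used = false;
}

static void
lc_add(LCState *st, int value)
{
	int		free_slot = -1;
	bool	found = false;

	st->n++;
	for (int i = 0; i < MAX_TRACKED && !found; i++)
	{
		if (st->items[i].used && st->items[i].value == value)
		{
			st->items[i].f++;
			found = true;
		}
		else if (!st->items[i].used && free_slot < 0)
			free_slot = i;
	}
	if (!found && free_slot >= 0)
	{
		LCItem		item = {value, 1, st->b_current - 1, true};

		st->items[free_slot] = item;
	}
	if (st->n % st->w == 0)		/* end of bucket: prune */
	{
		lc_prune(st);
		st->b_current++;
	}
}

/* estimated count for value; 0 if it was pruned as infrequent */
static int
lc_count(const LCState *st, int value)
{
	for (int i = 0; i < MAX_TRACKED; i++)
		if (st->items[i].used && st->items[i].value == value)
			return st->items[i].f;
	return 0;
}
```

ts_typanalyze() and array_typanalyze() would each drive something of this shape with their own per-element extraction, passing element Datums and the type comparison machinery through the Start call.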

> *** a/doc/src/sgml/catalogs.sgml
> --- b/doc/src/sgml/catalogs.sgml
> ***************
> *** 8253,8260 ****
>         <entry>
>          A list of the most common values in the column. (Null if
>          no values seem to be more common than any others.)
> !        For some data types such as <type>tsvector</>, this is a list of
> !        the most common element values rather than values of the type itself.
>         </entry>
>        </row>
>   
> --- 8253,8261 ----
>         <entry>
>          A list of the most common values in the column. (Null if
>          no values seem to be more common than any others.)
> !        For some data types such as <type>arrays</> and <type>tsvector</>,
> !        this is a list of the most common element values rather than values of
> !        the type itself.
>         </entry>
>        </row>
>   
> ***************
> *** 8266,8274 ****
>          A list of the frequencies of the most common values or elements,
>          i.e., number of occurrences of each divided by total number of rows.
>          (Null when <structfield>most_common_vals</structfield> is.)
> !        For some data types such as <type>tsvector</>, it can also store some
> !        additional information, making it longer than the
> !        <structfield>most_common_vals</> array.
>         </entry>
>        </row>
>   
> --- 8267,8275 ----
>          A list of the frequencies of the most common values or elements,
>          i.e., number of occurrences of each divided by total number of rows.
>          (Null when <structfield>most_common_vals</structfield> is.)
> !        For some data types such as <type>arrays</> and <type>tsvector</>,
> !        it can also store some additional information, making it longer than
> !        the <structfield>most_common_vals</> array.
>         </entry>
>        </row>

We're falsifying the above by splitting out that case into new columns
most_common_elems and most_common_elem_freqs.

>   
> ***************
> *** 8284,8289 ****
> --- 8285,8291 ----
>          does not have a <literal>&lt;</> operator or if the
>          <structfield>most_common_vals</> list accounts for the entire
>          population.)
> +        For <type>arrays</>, it holds histogram bounds of array lengths. 
>         </entry>
>        </row>

Likewise: that's now in the new column length_histogram_bounds.

We need documentation for the new pg_stats columns.  Also, in particular,
let's document the special entries at the end of most_common_freqs.

> *** a/src/backend/catalog/system_views.sql
> --- b/src/backend/catalog/system_views.sql
> ***************
> *** 117,145 **** CREATE VIEW pg_stats AS
>           stawidth AS avg_width,
>           stadistinct AS n_distinct,
>           CASE
> !             WHEN stakind1 IN (1, 4) THEN stavalues1
> !             WHEN stakind2 IN (1, 4) THEN stavalues2
> !             WHEN stakind3 IN (1, 4) THEN stavalues3
> !             WHEN stakind4 IN (1, 4) THEN stavalues4
>           END AS most_common_vals,
>           CASE
> !             WHEN stakind1 IN (1, 4) THEN stanumbers1
> !             WHEN stakind2 IN (1, 4) THEN stanumbers2
> !             WHEN stakind3 IN (1, 4) THEN stanumbers3
> !             WHEN stakind4 IN (1, 4) THEN stanumbers4
>           END AS most_common_freqs,
>           CASE
>               WHEN stakind1 = 2 THEN stavalues1
>               WHEN stakind2 = 2 THEN stavalues2
>               WHEN stakind3 = 2 THEN stavalues3
>               WHEN stakind4 = 2 THEN stavalues4
>           END AS histogram_bounds,
>           CASE
>               WHEN stakind1 = 3 THEN stanumbers1[1]
>               WHEN stakind2 = 3 THEN stanumbers2[1]
>               WHEN stakind3 = 3 THEN stanumbers3[1]
>               WHEN stakind4 = 3 THEN stanumbers4[1]
> !         END AS correlation
>       FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
>            JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
>            LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
> --- 117,170 ----
>           stawidth AS avg_width,
>           stadistinct AS n_distinct,
>           CASE
> !             WHEN stakind1 = 1 THEN stavalues1
> !             WHEN stakind2 = 1 THEN stavalues2
> !             WHEN stakind3 = 1 THEN stavalues3
> !             WHEN stakind4 = 1 THEN stavalues4
> !             WHEN stakind5 = 1 THEN stavalues5
>           END AS most_common_vals,
>           CASE
> !             WHEN stakind1 = 1 THEN stanumbers1
> !             WHEN stakind2 = 1 THEN stanumbers2
> !             WHEN stakind3 = 1 THEN stanumbers3
> !             WHEN stakind4 = 1 THEN stanumbers4
> !             WHEN stakind5 = 1 THEN stanumbers5
>           END AS most_common_freqs,
>           CASE
>               WHEN stakind1 = 2 THEN stavalues1
>               WHEN stakind2 = 2 THEN stavalues2
>               WHEN stakind3 = 2 THEN stavalues3
>               WHEN stakind4 = 2 THEN stavalues4
> +             WHEN stakind5 = 2 THEN stavalues5
>           END AS histogram_bounds,
>           CASE
>               WHEN stakind1 = 3 THEN stanumbers1[1]
>               WHEN stakind2 = 3 THEN stanumbers2[1]
>               WHEN stakind3 = 3 THEN stanumbers3[1]
>               WHEN stakind4 = 3 THEN stanumbers4[1]
> !             WHEN stakind5 = 3 THEN stanumbers5[1]
> !         END AS correlation,
> !         CASE
> !             WHEN stakind1 = 4 THEN stavalues1
> !             WHEN stakind2 = 4 THEN stavalues2
> !             WHEN stakind3 = 4 THEN stavalues3
> !             WHEN stakind4 = 4 THEN stavalues4
> !             WHEN stakind5 = 4 THEN stavalues5
> !         END AS most_common_elems,
> !         CASE
> !             WHEN stakind1 = 4 THEN stanumbers1
> !             WHEN stakind2 = 4 THEN stanumbers2
> !             WHEN stakind3 = 4 THEN stanumbers3
> !             WHEN stakind4 = 4 THEN stanumbers4
> !             WHEN stakind5 = 4 THEN stanumbers5
> !         END AS most_common_elem_freqs,

I think this is an improvement, but some code out there may rely on the
ability to get stakind = 4 data from the most_common_vals column.  We'll need
to mention this in the release notes as an incompatibility.

> !         CASE
> !             WHEN stakind1 = 5 THEN stavalues1
> !             WHEN stakind2 = 5 THEN stavalues2
> !             WHEN stakind3 = 5 THEN stavalues3
> !             WHEN stakind4 = 5 THEN stavalues4
> !             WHEN stakind5 = 5 THEN stavalues5
> !         END AS length_histogram_bounds
>       FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid)
>            JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum)
>            LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)

> *** a/src/backend/commands/typecmds.c
> --- b/src/backend/commands/typecmds.c
> ***************
> *** 610,616 **** DefineType(List *names, List *parameters)
>   			   F_ARRAY_SEND,	/* send procedure */
>   			   typmodinOid,		/* typmodin procedure */
>   			   typmodoutOid,	/* typmodout procedure */
> ! 			   InvalidOid,		/* analyze procedure - default */
>   			   typoid,			/* element type ID */
>   			   true,			/* yes this is an array type */
>   			   InvalidOid,		/* no further array type */
> --- 610,616 ----
>   			   F_ARRAY_SEND,	/* send procedure */
>   			   typmodinOid,		/* typmodin procedure */
>   			   typmodoutOid,	/* typmodout procedure */
> ! 			   ArrayTypanalyzeOid,		/* special analyze procedure for arrays */
>   			   typoid,			/* element type ID */
>   			   true,			/* yes this is an array type */
>   			   InvalidOid,		/* no further array type */

The recently-added function DefineRange() needs the same change.

> *** /dev/null
> --- b/src/backend/utils/adt/array_sel.c

"array_selfuncs.c" would better match our informal convention.

> ***************
> *** 0 ****
> --- 1,948 ----
> + /*-------------------------------------------------------------------------
> +  *
> +  * array_sel.c
> +  *	  Functions for selectivity estimation of array operators.
> +  *
> +  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
> +  *
> +  *
> +  * IDENTIFICATION
> +  *	  src/backend/tsearch/array_sel.c

Fix file location.

> +  *
> +  *-------------------------------------------------------------------------
> +  */
> + 
> + #include "postgres.h"
> + #include "access/hash.h"
> + #include "catalog/pg_operator.h"
> + #include "commands/vacuum.h"
> + #include "utils/builtins.h"
> + #include "utils/typcache.h"
> + #include "utils/array.h"
> + #include "catalog/pg_am.h"
> + #include "catalog/pg_collation.h"
> + #include "commands/defrem.h"
> + #include "utils/lsyscache.h"
> + #include "utils/selfuncs.h"

Sort the includes alphabetically, except for postgres.h coming first.

> + 
> + /* Default selectivity constant */
> + #define DEFAULT_CONT_SEL 0.005
> + 
> + /* Macro for selectivity estimation to be used if we have no statistics */
> + #define array_selec_no_stats(array,nitems,op) \
> + 	mcelem_array_selec(array, nitems, typentry, NULL, 0, NULL, 0, NULL, 0, op)
> + 
> + Datum		arraysel(PG_FUNCTION_ARGS);

extern prototypes go in header files, even when it's not strictly necessary.

> + 
> + static Selectivity calc_arraysel(VariableStatData *vardata, Datum constval,
> + 			  Oid operator);
> + static Selectivity mcelem_array_selec(ArrayType *array, int nitems, TypeCacheEntry *typentry,
> + 				   Datum *mcelem, int nmcelem, float4 *numbers, int nnumbers,
> + 				   Datum *hist, int nhist, Oid operator);
> + static int	element_compare(const void *key1, const void *key2);
> + bool		find_next_mcelem(Datum *mcelem, int nmcelem, Datum value, int *index);
> + static Selectivity mcelem_array_contain_overlap_selec(Datum *mcelem,
> +   int nmcelem, float4 *numbers, Datum *array_data, int nitems, Oid operator);
> + static float calc_hist(Datum *hist, int nhist, float *hist_part, int n);
> + static Selectivity mcelem_array_contained_selec(Datum *mcelem, int nmcelem,
> + 	  float4 *numbers, Datum *array_data, int nitems, Datum *hist, int nhist,
> + 							 Oid operator);
> + static float *calc_distr(float *p, int n, int m, float rest);
> + 
> + /* Compare function of element data type */
> + static FunctionCallInfoData cmpfunc;
> + 
> + /*
> +  * selectivity for "const = ANY(column)" and "const = ALL(column)"
> +  */
> + Selectivity
> + calc_scalararraysel(VariableStatData *vardata, Datum constval, bool orClause)

You have scalararraysel() calling this function for any operator (consider
"const < ANY(column)"), but it only handles a single operator: the "=" of the
default btree opclass used at ANALYZE time.  We could start by just returning
a constant selectivity for other operators, but we may be able to do better.
If the actual operator is the equality operator we used at ANALYZE time
(staop), use binary search on the mcelem array.  Otherwise, use linear search,
applying the operator to each MCE.  (If the benefits justify the trouble, you
could also use this strategy to support types with no default btree opclass.)
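In sketch form, the dispatch I have in mind looks like this; plain ints stand in for Datums, and all names are illustrative rather than anything in the patch:

```c
#include <stdbool.h>

typedef bool (*elem_pred)(int mce, int constval);

/*
 * Look up constval among the MCEs.  When the query's operator is the same
 * "=" captured at ANALYZE time, the mcelem array's sort order applies and
 * we can binary-search; otherwise fall back to a linear scan applying the
 * operator to each MCE.  Returns the matching index and sets *freq, or
 * returns -1 if no MCE satisfies the operator.
 */
static int
mce_lookup(const int *mcelem, const float *freqs, int nmcelem,
		   int constval, bool op_is_analyze_eq, elem_pred pred, float *freq)
{
	if (op_is_analyze_eq)
	{
		int		lo = 0,
				hi = nmcelem - 1;

		while (lo <= hi)
		{
			int		mid = lo + (hi - lo) / 2;

			if (mcelem[mid] == constval)
			{
				*freq = freqs[mid];
				return mid;
			}
			if (mcelem[mid] < constval)
				lo = mid + 1;
			else
				hi = mid - 1;
		}
	}
	else
	{
		for (int i = 0; i < nmcelem; i++)
			if (pred(mcelem[i], constval))
			{
				*freq = freqs[i];
				return i;
			}
	}
	return -1;
}

/* sample predicate for the linear path: stands in for an arbitrary operator */
static bool
pred_lt(int mce, int constval)
{
	return mce < constval;
}
```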

> + {
> + 	Oid			elemtype;
> + 	Selectivity selec;
> + 	TypeCacheEntry *typentry;
> + 	Datum	   *hist;
> + 	int			nhist;
> + 
> + 	elemtype = get_base_element_type(vardata->vartype);
> + 
> + 	/* Get default comparison function */
> + 	typentry = lookup_type_cache(elemtype,
> + 							  TYPECACHE_CMP_PROC | TYPECACHE_CMP_PROC_FINFO);
> + 	InitFunctionCallInfoData(cmpfunc, &typentry->cmp_proc_finfo, 2,
> + 							 DEFAULT_COLLATION_OID, NULL, NULL);

The default btree operator class that existed at ANALYZE time may no longer
exist.  If we don't find a cmp function here, punt to avoid a crash.

> + /*
> +  * arraysel -- restriction selectivity for "column @> const", "column && const"
> +  * and "column <@ const"
> +  */
> + Datum
> + arraysel(PG_FUNCTION_ARGS)
> + {
> + 	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
> + 
> + 	Oid			operator = PG_GETARG_OID(1);
> + 	List	   *args = (List *) PG_GETARG_POINTER(2);
> + 	int			varRelid = PG_GETARG_INT32(3);
> + 	VariableStatData vardata;
> + 	Node	   *other;
> + 	bool		varonleft;
> + 	Selectivity selec;
> + 	Oid			element_typeid;
> + 
> + 	/*
> + 	 * If expression is not variable = something or something = variable, then
> + 	 * punt and return a default estimate.
> + 	 */

The operator is never "=".

> + 	if (!get_restriction_variable(root, args, varRelid,
> + 								  &vardata, &other, &varonleft))
> + 		PG_RETURN_FLOAT8(DEFAULT_CONT_SEL);
> + 
> + 	/*
> + 	 * Can't do anything useful if the something is not a constant, either.
> + 	 */
> + 	if (!IsA(other, Const))
> + 	{
> + 		ReleaseVariableStats(vardata);
> + 		PG_RETURN_FLOAT8(DEFAULT_CONT_SEL);
> + 	}

Previously, we defaulted to 0.005 for operator "&&" (via areasel()) and 0.001
for operators "@>" and "<@" (via contsel()).  Surely some difference between
those cases remains appropriate.
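Something as small as this would preserve the distinction; the OID parameters are stand-ins, and real code would compare against the pg_operator OIDs for "&&", "@>" and "<@":

```c
/*
 * Choose the no-statistics default by operator instead of using a single
 * DEFAULT_CONT_SEL for all three cases, matching the historical behavior.
 */
#define DEFAULT_OVERLAP_SEL 0.005	/* historical areasel() default, "&&" */
#define DEFAULT_CONTAIN_SEL 0.001	/* historical contsel() default, "@>"/"<@" */

static double
array_default_sel(unsigned int operator_oid, unsigned int overlap_op_oid)
{
	return (operator_oid == overlap_op_oid) ?
		DEFAULT_OVERLAP_SEL : DEFAULT_CONTAIN_SEL;
}
```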

> + 		if (res == 0)
> + 		{
> + 			*index = i;
> + 			return true;
> + 		}
> + 		else if (res < 0)
> + 		{
> + 			l = i + 1;
> + 		}
> + 		else
> + 		{
> + 			r = i - 1;
> + 		}

Throughout this patch, omit braces around single statements.

> + 	}
> + 	*index = l;
> + 	return false;
> + }
> + 
> + /*
> +  * Array selectivity estimation based on most common elements statistics.
> +  */
> + static Selectivity
> + mcelem_array_selec(ArrayType *array, int nitems, TypeCacheEntry *typentry,
> + 	  Datum *mcelem, int nmcelem, float4 *numbers, int nnumbers, Datum *hist,
> + 				   int nhist, Oid operator)
> + {

> + 	/* "column @> const" and "column && const" cases */
> + 	if (operator == OID_ARRAY_CONTAIN_OP || operator == OID_ARRAY_OVERLAP_OP)
> + 		return mcelem_array_contain_overlap_selec(mcelem, nmcelem, numbers,
> + 									   array_data, nonnull_nitems, operator);
> + 
> + 	/* "column <@ const" case */
> + 	if (operator == OID_ARRAY_CONTAINED_OP)
> + 		return mcelem_array_contained_selec(mcelem, nmcelem, numbers,
> + 						  array_data, nonnull_nitems, hist, nhist, operator);
> + 	return DEFAULT_CONT_SEL;

Returning a fixed selectivity when this gets attached to an unexpected
operator seems counterproductive.  Let's just elog(ERROR) in that case.

> + /*
> +  * Array selectivity estimation based on most common elements statistics for
> +  * "column @> const" and "column && const" cases. This estimation assumes
> +  * element occurences to be independent.
> +  */
> + static Selectivity
> + mcelem_array_contain_overlap_selec(Datum *mcelem, int nmcelem,
> + 				float4 *numbers, Datum *array_data, int nitems, Oid operator)

We could take some advantage of the unique element count histogram for "@>".
Any column value with fewer distinct elements than the constant array cannot
match.  We perhaps can't use both that fact and element frequency stats at the
same time, but we could use the lesser of the two probabilities.

For operator "&&" or operator "@>" with a nonempty constant array, no rows
having empty arrays will match.  Excepting the special case of "@>" with an
empty constant array, we can multiply the MCE-derived selectivity by the
fraction, based on the histogram, of nonempty arrays in the column.  (For
"<@", rows having empty arrays always match.)

I don't think it's mandatory that the initial commit implement the above, but
it's something to mention in the comments as a future direction.
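To illustrate the "lesser of the two probabilities" idea: a row whose array has fewer distinct elements than the constant cannot satisfy "column @> const", so the fraction of the distinct-element-count histogram at or above the constant's size bounds the MCE-derived estimate from above.  A deliberately crude sketch, with a step-function reading of the histogram and illustrative names only:

```c
/* rough fraction of rows whose distinct-element count is >= min_count */
static double
frac_ge_count(const int *hist, int nhist, int min_count)
{
	int		i = 0;

	while (i < nhist && hist[i] < min_count)
		i++;
	return (double) (nhist - i) / nhist;
}

/* cap the MCE-derived "@>" selectivity by the histogram-derived bound */
static double
contain_sel_capped(double mce_sel, const int *hist, int nhist,
				   int const_distinct_count)
{
	double	cap = frac_ge_count(hist, nhist, const_distinct_count);

	return (mce_sel < cap) ? mce_sel : cap;
}
```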

> + /*
> +  * Let be n independent events with probabilities p. This function calculates
> +  * probabilities of exact k of events occurence for k in [0;m].
> +  * Imagine matrix M of (n + 1) x (m + 1) size. Element M[i,j] denotes
> +  * probability that exact j of first i events occurs. Obviously M[0,0] = 1.
> +  * Each next event increase total number of occured events if it occurs and
> +  * leave last value of that number if it doesn't occur. So, by the law of
> +  * total probability: M[i,j] = M[i - 1, j] * (1 - p[i]) + M[i - 1, j - 1] * p[i]
> +  * for i > 0, j > 0. M[i,0] = M[i - 1, 0] * (1 - p[i]) for i > 0.
> +  * Also there could be some events with low probabilities. Their summary
> +  * probability passed in the rest parameter.
> +  */
> + static float *
> + calc_distr(float *p, int n, int m, float rest)
> + {

> + 	/* Take care about events with low probabilities. */
> + 	if (rest > 0.0f)
> + 	{
> + 		/*
> + 		 * The probability of no occurence of events which forms "rest"
> + 		 * probability have a limit of exp(-rest) when number of events fo to
> + 		 * infinity. Another simplification is to replace that events with one
> + 		 * event with (1 - exp(-rest)) probability.
> + 		 */
> + 		rest = 1.0f - exp(-rest);

What is the name of the underlying concept in probability theory?
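For reference, the recurrence in the quoted comment, minus the "rest" adjustment, amounts to this standalone dynamic program; purely illustrative, not the patch's calc_distr():

```c
#include <stdlib.h>

/*
 * Given n independent events with probabilities p[0..n-1], return a
 * malloc'd array r where r[k] = P(exactly k events occur), for k in [0, m]:
 * M[i,j] = M[i-1, j] * (1 - p[i]) + M[i-1, j-1] * p[i], with M[0,0] = 1.
 */
static double *
calc_distr_sketch(const double *p, int n, int m)
{
	double	   *prev = calloc(m + 1, sizeof(double));
	double	   *cur = calloc(m + 1, sizeof(double));

	prev[0] = 1.0;				/* M[0,0] = 1 */
	for (int i = 1; i <= n; i++)
	{
		for (int j = 0; j <= m; j++)
			cur[j] = prev[j] * (1.0 - p[i - 1]) +
				(j > 0 ? prev[j - 1] * p[i - 1] : 0.0);
		{
			double	   *tmp = prev;

			prev = cur;
			cur = tmp;
		}
	}
	free(cur);
	return prev;				/* prev[k] = P(exactly k events occur) */
}
```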

> + /*
> +  * Array selectivity estimation based on most common elements statistics for
> +  * "column <@ const" case. Assumption that element occurences are independent
> +  * means certain distribution of array lengths. Typically real distribution
> +  * of lengths is significantly different from it. For example, if even we
> +  * have set of arrays with 1 integer element in range [0;10] each, element
> +  * occurences are not independent. Because in the case of independence we

Do you refer to a column where '{1,12,46}' and '{13,7}' may appear, but
'{6,19,4}' cannot appear?

> +  * have probabilities of length of 0, 1, 2 etc. In the "column @> const"
> +  * and "column && const" cases we usually have "const" with low summary
> +  * frequency of elements (otherwise we have selectivity close to 0 or 1
> +  * correspondingly). That's why effect of dependence related to lengths
> +  * distribution id negligible there. In the "column <@ const" case summary
> +  * frequency of elements is high (otherwise we have selectivity close to 0).

What does the term "summary frequency" denote?

> +  * That's why we should do correction due to array lengths distribution.
> +  */
> + static Selectivity
> + mcelem_array_contained_selec(Datum *mcelem, int nmcelem, float4 *numbers,
> + 		 Datum *array_data, int nitems, Datum *hist, int nhist, Oid operator)

Break up the parameter list to avoid pgindent reverse-indenting the line.


When I tried a constant array with duplicate elements, I saw an inappropriate
report of 100% selectivity:

[local] test=# create table t1 as select array[n % 2, n] as arr from generate_series(1,100000) t(n);
SELECT 100000
[local] test=# analyze t1;
WARNING:  problem in alloc set Analyze: detected write past chunk end in block 0x7f189cbbf010, chunk 0x7f189cc1d0d8
ANALYZE
[local] test=# explain select * from t1 where arr <@ '{0,45}';
                       QUERY PLAN
--------------------------------------------------------
 Seq Scan on t1  (cost=0.00..1986.00 rows=186 width=29)
   Filter: (arr <@ '{0,45}'::integer[])
(2 rows)

[local] test=# explain select * from t1 where arr <@ '{0,45,45}';
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on t1  (cost=0.00..1986.00 rows=100000 width=29)
   Filter: (arr <@ '{0,45,45}'::integer[])
(2 rows)

By contrast, the estimate in the non-duplicate case looks sane considering
these estimates for the individual elements:

[local] test=# explain select * from t1 where 0 = any (arr);
                        QUERY PLAN                        
----------------------------------------------------------
 Seq Scan on t1  (cost=0.00..2986.00 rows=49967 width=29)
   Filter: (0 = ANY (arr))
(2 rows)

[local] test=# explain select * from t1 where 45 = any (arr);
                       QUERY PLAN                       
--------------------------------------------------------
 Seq Scan on t1  (cost=0.00..2986.00 rows=500 width=29)
   Filter: (45 = ANY (arr))
(2 rows)


Incidentally, "const = ALL (column)" should be equivalent to "column <@
array[const]" (assuming the "=" operator chosen in the first statement is the
"=" operator of the element type's default btree opclass).  However, I get
significantly different estimates, with the latter being the better one:

[local] test=# explain select * from t1 where 1 = all (arr);
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on t1  (cost=0.00..2986.00 rows=18407 width=29)
   Filter: (1 = ALL (arr))
(2 rows)

[local] test=# explain select * from t1 where arr <@ array[1];
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on t1  (cost=0.00..1986.00 rows=1 width=29)
   Filter: (arr <@ '{1}'::integer[])
(2 rows)

> + 	/*
> + 	 * Rest is a average length of elements which aren't present in mcelem.
> + 	 */
> + 	rest = avg_length;

You define "rest" here as an array length ...

> + 
> + 	default_freq = Min(DEFAULT_CONT_SEL, minfreq / 2);
> + 
> + 	mcelem_index = 0;
> + 
> + 	/*
> + 	 * mult is the multiplier that presents estimate of probability that each
> + 	 * mcelem which is not present in constant doesn't occur.
> + 	 */
> + 	mult = 1.0f;
> + 
> + 	for (i = 0; i < nitems; i++)
> + 	{
> + 		bool		found = false;
> + 
> + 		/* Comparison with previous value in order to guarantee uniquness */
> + 		if (i > 0)
> + 		{
> + 			if (!element_compare(&array_data[i - 1], &array_data[i]))
> + 				continue;
> + 		}
> + 
> + 		/*
> + 		 * Iterate over mcelem until find mcelem that is greater or equal to
> + 		 * element of constant. Simultaneously taking care about rest and
> + 		 * mult. If that mcelem is found then fill corresponding elem_selec.
> + 		 */
> + 		while (mcelem_index < nmcelem)
> + 		{
> + 			int			cmp = element_compare(&mcelem[mcelem_index], &array_data[i]);
> + 
> + 			if (cmp < 0)
> + 			{
> + 				mult *= (1.0f - numbers[mcelem_index]);
> + 				rest -= numbers[mcelem_index];

... But here, you're subtracting a frequency from an array length?

> + /*
> +  * Comparison function for elements. Based on default comparison function for
> +  * array element data type.
> +  */
> + static int
> + element_compare(const void *key1, const void *key2)
> + {
> + 	const Datum *d1 = (const Datum *) key1;
> + 	const Datum *d2 = (const Datum *) key2;
> + 
> + 	cmpfunc.	arg[0] = *d1;
> + 	cmpfunc.	arg[1] = *d2;
> + 	cmpfunc.	argnull[0] = false;
> + 	cmpfunc.	argnull[1] = false;
> + 	cmpfunc.	isnull = false;
> + 
> + 	return DatumGetInt32(FunctionCallInvoke(&cmpfunc));
> + }

We could easily make this reentrant by passing the fcinfo through an argument
and using qsort_arg().  Please do so.
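The shape I have in mind, sketched with an int "direction" standing in for the fcinfo and a tiny insertion sort standing in for qsort_arg() so the example is self-contained:

```c
/*
 * Reentrant comparator: the state arrives as an explicit argument,
 * qsort_arg()-style, instead of being read from a file-static
 * FunctionCallInfoData.  All names here are illustrative.
 */
typedef int (*cmp_arg_fn)(const void *a, const void *b, void *arg);

static void
sort_arg(int *a, int n, cmp_arg_fn cmp, void *arg)
{
	for (int i = 1; i < n; i++)
		for (int j = i; j > 0 && cmp(&a[j - 1], &a[j], arg) > 0; j--)
		{
			int		tmp = a[j];

			a[j] = a[j - 1];
			a[j - 1] = tmp;
		}
}

/* element_compare with its state passed in rather than global */
static int
element_compare_arg(const void *key1, const void *key2, void *arg)
{
	int		sign = *(int *) arg;	/* stand-in for the cmp proc fcinfo */
	int		d1 = *(const int *) key1;
	int		d2 = *(const int *) key2;

	return sign * ((d1 > d2) - (d1 < d2));
}
```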

> *** /dev/null
> --- b/src/backend/utils/adt/array_typanalyze.c
> ***************
> *** 0 ****
> --- 1,834 ----
> + /*-------------------------------------------------------------------------
> +  *
> +  * array_typanalyze.c
> +  *	  functions for gathering statistics from array columns
> +  *
> +  * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
> +  *
> +  *
> +  * IDENTIFICATION
> +  *	  src/backend/tsearch/array_typanalyze.c

Fix file location.

> +  *
> +  *-------------------------------------------------------------------------
> +  */
> + 
> + #include "postgres.h"
> + #include "access/hash.h"
> + #include "catalog/pg_operator.h"
> + #include "commands/vacuum.h"
> + #include "commands/defrem.h"
> + #include "parser/parse_oper.h"
> + #include "utils/builtins.h"
> + #include "utils/hsearch.h"
> + #include "utils/typcache.h"
> + #include "utils/array.h"
> + #include "catalog/pg_am.h"
> + #include "catalog/pg_collation.h"
> + #include "utils/lsyscache.h"
> + #include "utils/selfuncs.h"

Alphabetize the includes after "postgres.h".

> + 
> + #define ARRAY_ANALYZE_CHECK_OID(x) \
> + 	if (!OidIsValid(x)) \
> + 	{ \
> + 		elog(INFO, "Can't collect statistics on %d array type. Array \
> + 					statistics collection requires default hash and btree \
> + 					opclasses for element type.", stats->attrtypid); \
> + 		stats->stats_valid = false; \
> + 		return; \
> + 	}

No message is necessary: the absence of a btree opclass or of both opclasses
degrades normal statistics, and we make no message about it then.

The decision to abandon the stats at this point causes a minor regression for
types having only a default hash opclass.  They currently get scalar minimal
stats; now they'll have none.  See below for one way to avoid that.

This macro is used in just one function, and the return statement means one
wouldn't lightly use it anywhere else.  Therefore, as a minor style point, I'd
place its definition right with the function that uses it rather than at the
head of the file.

> + Datum		array_typanalyze(PG_FUNCTION_ARGS);

extern prototypes go in header files, even when it's not strictly necessary.

> + /*
> +  *	array_typanalyze -- a custom typanalyze function for array columns
> +  */
> + Datum
> + array_typanalyze(PG_FUNCTION_ARGS)
> + {
> + 	VacAttrStats *stats = (VacAttrStats *) PG_GETARG_POINTER(0);
> + 	Form_pg_attribute attr = stats->attr;
> + 	Oid			ltopr;
> + 	Oid			eqopr;
> + 	StdAnalyzeData *mystats;
> + 
> + 	/* If the attstattarget column is negative, use the default value */
> + 	/* NB: it is okay to scribble on stats->attr since it's a copy */
> + 	if (attr->attstattarget < 0)
> + 		attr->attstattarget = default_statistics_target;
> + 
> + 	/* Look for default "<" and "=" operators for column's type */
> + 	get_sort_group_operators(stats->attrtypid,
> + 							 false, false, false,
> + 							 &ltopr, &eqopr, NULL,
> + 							 NULL);
> + 
> + 	/* If column has no "=" operator, we can't do much of anything */
> + 	if (!OidIsValid(eqopr))
> + 		return false;
> + 
> + 	/* Save the operator info for compute_stats routines */
> + 	mystats = (StdAnalyzeData *) palloc(sizeof(StdAnalyzeData));
> + 	mystats->eqopr = eqopr;
> + 	mystats->eqfunc = get_opcode(eqopr);
> + 	mystats->ltopr = ltopr;
> + 	stats->extra_data = mystats;

Instead of duplicating this initialization and exporting StdAnalyzeData,
compute_scalar_stats() and compute_minimal_stats() from analyze.c, I suggest
structuring things as follows.  Export only std_typanalyze() and call it here.
If it returns false, return false here, too.  Otherwise, proceed with the
additional lookups you currently do in compute_array_stats().  If you don't
find everything you need (default btree operator class, for example), just
return true; the analysis will continue with the standard scalar approach as
setup by std_typanalyze().  Otherwise, replace compute_stats and extra_data
with your own materials.
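In sketch form (every function here is a stand-in returning canned values, purely to show the fallback shape, not the real typanalyze API):

```c
#include <stdbool.h>

typedef struct
{
	int		mode;	/* 0 = no stats, 1 = standard scalar, 2 = array-specific */
} StatsSketch;

/* stand-in for std_typanalyze(): succeeds and sets up scalar stats */
static bool
std_typanalyze_stub(StatsSketch *s)
{
	s->mode = 1;
	return true;
}

static bool
array_typanalyze_sketch(StatsSketch *s,
						bool have_default_btree, bool have_default_hash)
{
	if (!std_typanalyze_stub(s))
		return false;			/* no "=" operator at all: no stats */
	if (!have_default_btree || !have_default_hash)
		return true;			/* keep the scalar compute_stats fallback */
	s->mode = 2;				/* replace compute_stats and extra_data */
	return true;
}
```

That way a type with only a default hash opclass keeps its current minimal scalar statistics instead of losing statistics entirely.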

> + 
> + 	stats->compute_stats = compute_array_stats;
> + 	/* see comment about the choice of minrows in commands/analyze.c */
> + 	stats->minrows = 300 * attr->attstattarget;
> + 
> + 	PG_RETURN_BOOL(true);
> + }
> + 
> + /*
> +  *	compute_array_stats() -- compute statistics for a array column
> +  *
> +  *	This functions computes statistics that are useful for determining <@, &&,
> +  *	@> operations selectivity, along with the fraction of non-null rows and
> +  *	average width.
> +  *
> +  *	As an addition to the the most common values, as we do for most datatypes,
> +  *	we're looking for the most common elements and length histogram. In the
> +  *	case of relatively long arrays it might be more useful, because there most
> +  *	probably won't be any two rows with the same array and thus MCV has no
> +  *	much sense. With a list of the most common elements we can do a better job
> +  *	at figuring out <@, &&, @> selectivity. Arrays length histogram helps to
> +  *	"column <@ const" to be more precise.

The histogram addresses array distinct element counts, not array lengths.
That's exactly what we need for the selectivity calculations in question.
Please update the terminology throughout the patch, though.

> +  *
> +  *	The algorithm used is Lossy Counting, as proposed in the paper "Approximate
> +  *	frequency counts over data streams" by G. S. Manku and R. Motwani, in
> +  *	Proceedings of the 28th International Conference on Very Large Data Bases,
> +  *	Hong Kong, China, August 2002, section 4.2. The paper is available at
> +  *	http://www.vldb.org/conf/2002/S10P03.pdf
> +  *
> +  *	The Lossy Counting (aka LC) algorithm goes like this:
> +  *	Let s be the threshold frequency for an item (the minimum frequency we
> +  *	are interested in) and epsilon the error margin for the frequency. Let D
> +  *	be a set of triples (e, f, delta), where e is an element value, f is that
> +  *	element's frequency (actually, its current occurrence count) and delta is
> +  *	the maximum error in f. We start with D empty and process the elements in
> +  *	batches of size w. (The batch size is also known as "bucket size" and is
> +  *	equal to 1/epsilon.) Let the current batch number be b_current, starting
> +  *	with 1. For each element e we either increment its f count, if it's
> +  *	already in D, or insert a new triple into D with values (e, 1, b_current
> +  *	- 1). After processing each batch we prune D, by removing from it all
> +  *	elements with f + delta <= b_current.  After the algorithm finishes we
> +  *	suppress all elements from D that do not satisfy f >= (s - epsilon) * N,
> +  *	where N is the total number of elements in the input.  We emit the
> +  *	remaining elements with estimated frequency f/N.  The LC paper proves
> +  *	that this algorithm finds all elements with true frequency at least s,
> +  *	and that no frequency is overestimated or is underestimated by more than
> +  *	epsilon.  Furthermore, given reasonable assumptions about the input
> +  *	distribution, the required table size is no more than about 7 times w.
> +  *
> +  *	We set s to be the estimated frequency of the K'th element in a natural
> +  *	language's frequency table, where K is the target number of entries in
> +  *	the MCELEM array. We assume that the distribution of element frequencies
> +  *	follows Zipf's law with an exponent of 1.
> +  *
> +  *	Assuming Zipfian distribution, the frequency of the K'th element is equal
> +  *	to 1/(K * H(W)) where H(n) is 1/2 + 1/3 + ... + 1/n and W is the number of
> +  *	elements in the language.	Putting W as one million, we get roughly
> +  *	0.07/K. This gives s = 0.07/K.	We set epsilon = s/10, which gives bucket
> +  *	width w = K/0.007 and maximum expected hashtable size of about 1000 * K.

These last two paragraphs, adapted from ts_typanalyze.c, assume natural
language documents.  To what extent do these parameter choices remain sensible
for arbitrary data such as users may place in arrays?  In any event, we need a
different justification, even if it's just a hand-wavy justification.
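(As a quick check of the arithmetic in the quoted comment, a sketch rather than anything from the patch, taking W = 10^6 as assumed there:)

```python
# Zipfian frequency of the rank-K element is 1/(K * H(W)).
W = 1_000_000
H_quoted = sum(1.0 / i for i in range(2, W + 1))  # 1/2 + ... + 1/W, per the comment
H_full = H_quoted + 1.0                           # conventional harmonic number
print(1 / H_quoted, 1 / H_full)  # ~0.0747 and ~0.0695: both "roughly 0.07"
```

So the 0.07/K figure holds either way one reads H(n), but only under the W = 10^6, Zipf-exponent-1 assumptions.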

If I'm following this correctly, this choice of "s" makes the algorithm
guaranteed to find only elements constituting >= 7% of the input elements.
Incidentally, isn't that far too high for natural language documents?  If the
English word "the" only constitutes 7% of typical documents, then this "s"
value would permit us to discard practically every word; we'd be left with
words read while filling the last bucket and words that happened to repeat
exceedingly often in the column.  I haven't tried to make a test case to
observe this problem; am I missing something?  (This question is largely
orthogonal to your patch.)
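For readers following along, the quoted Lossy Counting loop can be sketched in a few lines of Python (a simplification for illustration, not the patch's implementation):

```python
def lossy_count(stream, s, epsilon):
    """Return items with estimated frequency >= (s - epsilon) * N,
    per the Manku & Motwani algorithm quoted above (section 4.2)."""
    w = int(1 / epsilon)      # bucket ("batch") width
    D = {}                    # element -> (f, delta)
    b_current = 1
    n = 0
    for e in stream:
        n += 1
        if e in D:
            f, delta = D[e]
            D[e] = (f + 1, delta)
        else:
            D[e] = (1, b_current - 1)
        if n % w == 0:        # prune after each full bucket
            D = {k: (f, d) for k, (f, d) in D.items() if f + d > b_current}
            b_current += 1
    return {k: f / n for k, (f, d) in D.items() if f >= (s - epsilon) * n}

stream = [1] * 50 + [2] * 30 + list(range(100, 120))
print(lossy_count(stream, s=0.2, epsilon=0.1))  # {1: 0.5, 2: 0.3}
```

Note how the twenty singleton elements are pruned at the bucket boundaries while the two genuinely frequent elements survive with exact counts; with the patch's s = 0.07/K, only correspondingly frequent elements would be retained.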

> +  *
> +  *	Note: in the above discussion, s, epsilon, and f/N are in terms of a
> +  *	element's frequency as a fraction of all elements seen in the input.
> +  *	However, what we actually want to store in the finished pg_statistic
> +  *	entry is each element's frequency as a fraction of all rows that it occurs
> +  *	in. Elements might be repeated in the same array. Since operators
> +  *	<@, &&, @> takes care only about element occurence itself and not about
> +  *	occurence count, function takes additional care about uniqueness of
> +  *	counting. Also we need to change the divisor from N to nonnull_cnt to get
> +  *	the number we want.

On the same tangent, why does ts_typanalyze() not deduplicate the same way?
The @@ operator has the same property.

> +  */
> + static void
> + compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
> + 					int samplerows, double totalrows)
> + {
> + 	int			num_mcelem;
> + 	int			null_cnt = 0;
> + 
> + 	/*
> + 	 * We should count not only null array values, but also null array
> + 	 * elements
> + 	 */
> + 	int			null_elem_cnt = 0.0;
> + 
> + 	double		total_width = 0.0;
> + 	double		total_length = 0.0;
> + 
> + 	/* This is D from the LC algorithm. */
> + 	HTAB	   *elements_tab;
> + 	HASHCTL		elem_hash_ctl;
> + 	HASH_SEQ_STATUS scan_status;
> + 
> + 	/* This is the current bucket number from the LC algorithm */
> + 	int			b_current;
> + 
> + 	/* This is 'w' from the LC algorithm */
> + 	int			bucket_width;
> + 	int			array_no,
> + 				element_no;
> + 	Datum		hash_key;
> + 	TrackItem  *item;
> + 
> + 	int			lengths_count;
> + 	int			length_index;
> + 	int			slot_idx = 0;
> + 	HTAB	   *length_tab;
> + 	HASHCTL		length_hash_ctl;
> + 	LengthItem *length_item;
> + 	LengthItem *sorted_length_tab;
> + 
> + 	/*
> + 	 * Most part of array operators, which selectivity is estimated by this
> + 	 * statistics, takes care only one occurence of element in array. That's
> + 	 * why we should take care about count element occurence only once per
> + 	 * array. To clean occurence flag for each array by iterating over all
> + 	 * hash table can be too expensive. That's why we store pointers to hash
> + 	 * items contain elements which occur in last array.
> + 	 */
> + 	TrackItem **occurences = NULL;
> + 	int			occurences_count = 0,
> + 				occurence_index;
> + 
> + 	TypeCacheEntry *typentry;
> + 	Oid			hash_opclass,
> + 				hash_opfamily,
> + 				element_typeid,
> + 				hash_oroc;
> + 	FmgrInfo	hash_func_info;
> + 
> + 	StdAnalyzeData *mystats = (StdAnalyzeData *) stats->extra_data;
> + 
> + 	/* Compute standard statistics */
> + 	if (OidIsValid(mystats->ltopr))
> + 		compute_scalar_stats(stats, fetchfunc, samplerows, totalrows);
> + 	else
> + 		compute_minimal_stats(stats, fetchfunc, samplerows, totalrows);
> + 
> + 
> + 	/* Gathering all necessary information about element data type. */
> + 
> + 	element_typeid = stats->attrtype->typelem;
> + 
> + 	if (!OidIsValid(element_typeid))
> + 		elog(ERROR, "array_typanalyze was invoked with %d non-array type",
> + 			 stats->attrtypid);
> + 
> + 	typentry = lookup_type_cache(element_typeid, TYPECACHE_EQ_OPR |
> + 	 TYPECACHE_CMP_PROC | TYPECACHE_EQ_OPR_FINFO | TYPECACHE_CMP_PROC_FINFO);
> + 	ARRAY_ANALYZE_CHECK_OID(typentry->cmp_proc);
> + 	ARRAY_ANALYZE_CHECK_OID(typentry->eq_opr);
> + 
> + 	hash_opclass = GetDefaultOpClass(element_typeid, HASH_AM_OID);
> + 	ARRAY_ANALYZE_CHECK_OID(hash_opclass);
> + 
> + 	hash_opfamily = get_opclass_family(hash_opclass);
> + 	ARRAY_ANALYZE_CHECK_OID(hash_opfamily);
> + 
> + 	hash_oroc = get_opfamily_proc(hash_opfamily, element_typeid,
> + 								  element_typeid, HASHPROC);
> + 	ARRAY_ANALYZE_CHECK_OID(hash_oroc);
> + 
> + 	fmgr_info(hash_oroc, &hash_func_info);
> + 
> + 	InitFunctionCallInfoData(element_type_info.cmp, &typentry->cmp_proc_finfo,
> + 							 2, DEFAULT_COLLATION_OID, NULL, NULL);
> + 	InitFunctionCallInfoData(element_type_info.eq, &typentry->eq_opr_finfo,
> + 							 2, DEFAULT_COLLATION_OID, NULL, NULL);
> + 	InitFunctionCallInfoData(element_type_info.hash, &hash_func_info,
> + 							 1, DEFAULT_COLLATION_OID, NULL, NULL);
> + 	element_type_info.typbyval = typentry->typbyval;

As I mentioned above in passing, all of the above setup only needs to happen
once per analyzed column, not once per tuple.  It belongs in array_typanalyze;
define a struct to hold all the looked-up state, including a pointer to any
state for std_typanalyze(), and store that in stats->extra_data.

This code should probably get around to using the new SortSupport
infrastructure like analyze.c now uses.  Not sure whether that's mandatory for
initial commit.

typanalyze functions that call arbitrary user code must be reentrant.  It only
matters in corner cases, but crashing in those corner cases is not acceptable.
See our use of CurTupleHashTable in execGrouping.c and defend this global
state in a similar fashion.

> + 
> + 	/*
> + 	 * We want statistics_target * 10 elements in the MCELEM array. This
> + 	 * multiplier is pretty arbitrary, but is meant to reflect the fact that
> + 	 * the number of individual elements tracked in pg_statistic ought to be
> + 	 * more than the number of values for a simple scalar column.
> + 	 */
> + 	num_mcelem = stats->attr->attstattarget * 10;
> + 
> + 	/*
> + 	 * We set bucket width equal to (num_mcelem + 10) / 0.007 as per the
> + 	 * comment above.
> + 	 */
> + 	bucket_width = num_mcelem * 1000 / 7;

The addend mentioned is not present in the code or discussed in "the comment
above".  (I see the comment is copied verbatim from ts_typanalyze(), where the
addend *is* present, though again the preceding comment says nothing of it.)
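(For what it's worth, the arithmetic without the addend does come out as coded; a quick check, treating K = num_mcelem:)

```python
K = 100 * 10          # e.g. default attstattarget (100) * 10
s = 0.07 / K          # threshold frequency, per the earlier comment
epsilon = s / 10      # error margin
w = 1 / epsilon       # bucket width = K / 0.007
print(int(w), K * 1000 // 7)  # both 142857
```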

> + 
> + 	/*
> + 	 * Create the hashtable. It will be in local memory, so we don't need to
> + 	 * worry about overflowing the initial size. Also we don't need to pay any
> + 	 * attention to locking and memory management.
> + 	 */
> + 	MemSet(&elem_hash_ctl, 0, sizeof(elem_hash_ctl));
> + 	elem_hash_ctl.keysize = sizeof(Datum);
> + 	elem_hash_ctl.entrysize = sizeof(TrackItem);
> + 	elem_hash_ctl.hash = element_hash;
> + 	elem_hash_ctl.match = element_match;
> + 	elem_hash_ctl.hcxt = CurrentMemoryContext;
> + 	elements_tab = hash_create("Analyzed elements table",
> + 							   bucket_width * 7,

Though it's copied from compute_tsvector_stats(), why "7"?

> + 							   &elem_hash_ctl,
> + 					HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
> + 
> + 	/*
> + 	 * hashtable for arrays lengths.
> + 	 */
> + 	MemSet(&length_hash_ctl, 0, sizeof(length_hash_ctl));
> + 	length_hash_ctl.keysize = sizeof(int);
> + 	length_hash_ctl.entrysize = sizeof(LengthItem);
> + 	length_hash_ctl.hash = tag_hash;

You need to pass the HASH_FUNCTION flag for this setting to take effect.

> + 	length_hash_ctl.match = memcmp;

Omit this.  You would need to pass HASH_COMPARE for it to take effect, but
it's also implicit once you override the hash function.

> + 	length_hash_ctl.hcxt = CurrentMemoryContext;
> + 	length_tab = hash_create("Array length table",
> + 							 64,
> + 							 &length_hash_ctl,
> + 							 HASH_ELEM | HASH_CONTEXT);
> + 
> + 	/* Initialize counters. */
> + 	b_current = 1;
> + 	element_no = 0;
> + 
> + 	/* Loop over the arrays. */
> + 	for (array_no = 0; array_no < samplerows; array_no++)
> + 	{
> + 		Datum		value;
> + 		bool		isnull;
> + 		bool		null_present;
> + 		ArrayType  *array;
> + 		char	   *ptr;
> + 		bits8	   *bitmap;
> + 		int			bitmask;
> + 		int			j;
> + 		int			ndims;
> + 		int		   *dims;
> + 		int			nitems;
> + 		bool		length_found;
> + 
> + 		vacuum_delay_point();
> + 
> + 		value = fetchfunc(stats, array_no, &isnull);
> + 
> + 		/*
> + 		 * Check for null/nonnull.
> + 		 */
> + 		if (isnull)
> + 		{
> + 			null_cnt++;
> + 			continue;
> + 		}
> + 
> + 		/*
> + 		 * Add up widths for average-width calculation.  Since it's a array,
> + 		 * we know it's varlena.  As in the regular compute_minimal_stats
> + 		 * function, we use the toasted width for this calculation.
> + 		 */
> + 		total_width += VARSIZE_ANY(DatumGetPointer(value));
> + 
> + 		/*
> + 		 * Now detoast the array if needed.
> + 		 */
> + 		array = DatumGetArrayTypeP(value);
> + 		ptr = ARR_DATA_PTR(array);
> + 		bitmap = ARR_NULLBITMAP(array);
> + 		bitmask = 1;
> + 		ndims = ARR_NDIM(array);
> + 		dims = ARR_DIMS(array);
> + 		nitems = ArrayGetNItems(ndims, dims);
> + 
> + 		/*
> + 		 * Check if we have enough of memory to store element occurences in
> + 		 * one array.
> + 		 */
> + 		if (nitems > occurences_count)
> + 		{
> + 			occurences_count = 2 * nitems;
> + 			if (occurences)
> + 				occurences = (TrackItem **) repalloc(occurences,
> + 									 sizeof(TrackItem *) * occurences_count);
> + 			else
> + 				occurences = (TrackItem **) palloc(
> + 									 sizeof(TrackItem *) * occurences_count);
> + 		}
> + 		occurence_index = 0;
> + 
> + 		null_present = false;
> + 
> + 		/*
> + 		 * We loop through the elements in the array and add them to our
> + 		 * tracking hashtable.	Note: the hashtable entries will point into
> + 		 * the (detoasted) array value, therefore we cannot free that storage
> + 		 * until we're done.
> + 		 */

The second sentence of this comment is obsolete.

> + 		for (j = 0; j < nitems; j++)
> + 		{
> + 			bool		found;
> + 			bool		isnull;
> + 
> + 			/* Get elements, checking for NULL */
> + 			if (bitmap && (*bitmap & bitmask) == 0)
> + 			{
> + 				hash_key = (Datum) 0;
> + 				isnull = true;
> + 				null_present = true;
> + 			}
> + 			else
> + 			{
> + 				/* Get element value */
> + 				hash_key = fetch_att(ptr, typentry->typbyval, typentry->typlen);
> + 				isnull = false;
> + 
> + 				/*
> + 				 * We should allocate memory for element if it isn't passing
> + 				 * by value, because array will be freed after that loop.
> + 				 */
> + 				if (!typentry->typbyval)
> + 				{
> + 					Datum		tmp;
> + 					int			length;
> + 
> + 					length = (typentry->typlen == -1) ?
> + 						VARSIZE(hash_key) : typentry->typlen;
> + 					tmp = (Datum) MemoryContextAlloc(stats->anl_context, length);
> + 					memcpy((void *) tmp, (void *) hash_key, length);
> + 					hash_key = tmp;
> + 				}

Please use datumCopy() or add a comment about why it's insufficient.

> + 				ptr = att_addlength_pointer(ptr, typentry->typlen, ptr);
> + 				ptr = (char *) att_align_nominal(ptr, typentry->typalign);
> + 			}
> + 
> + 			/* Advance bitmap pointers if any */
> + 			bitmask <<= 1;
> + 			if (bitmask == 0x100)
> + 			{
> + 				if (bitmap)
> + 					bitmap++;
> + 				bitmask = 1;
> + 			}
> + 
> + 			/* No null element processing other then flag setting here */
> + 			if (isnull)
> + 				continue;
> + 
> + 			/* Lookup current element in hashtable, adding it if new */
> + 			item = (TrackItem *) hash_search(elements_tab,
> + 											 (const void *) &hash_key,
> + 											 HASH_ENTER, &found);
> + 
> + 			if (found)
> + 			{
> + 				/*
> + 				 * The element is already on the tracking list. If it is the
> + 				 * first occurence in array then update element frequency.
> + 				 */
> + 				if (!item->occurence)
> + 				{
> + 					item->frequency++;
> + 					item->occurence = true;
> + 					occurences[occurence_index++] = item;
> + 				}
> + 			}
> + 			else
> + 			{
> + 				/* Initialize new tracking list element */
> + 				item->frequency = 1;
> + 				item->delta = b_current - 1;
> + 				item->occurence = true;
> + 				occurences[occurence_index++] = item;
> + 			}
> + 
> + 			/* element_no is the number of elements processed (ie N) */
> + 			element_no++;

We should not bump element_no when we skipped the element as a duplicate
within the same array.

> + 
> + 			/* We prune the D structure after processing each bucket */
> + 			if (element_no % bucket_width == 0)
> + 			{
> + 				prune_element_hashtable(elements_tab, b_current);
> + 				b_current++;
> + 			}
> + 		}
> + 		/* Count null element only once per array */
> + 		if (null_present)
> + 			null_elem_cnt++;
> + 
> + 		/* Update frequency of particular array length. */
> + 		length_item = (LengthItem *) hash_search(length_tab,
> + 												 &occurence_index,
> + 												 HASH_ENTER, &length_found);
> + 		if (length_found)
> + 		{
> + 			length_item->frequency++;
> + 		}
> + 		else
> + 		{
> + 			length_item->length = occurence_index;
> + 			length_item->frequency = 1;
> + 		}
> + 		total_length += occurence_index;
> + 
> + 		/*
> + 		 * When we end processing of particular array we should clean the
> + 		 * occurence flag.
> + 		 */
> + 		for (j = 0; j < occurence_index; j++)
> + 			occurences[j]->occurence = false;

Could you usefully simplify this away by storing the last-containing array_no,
instead of a bool, in each hash entry?
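That is, a stamp that is compared rather than cleared removes the need for the reset loop entirely; a rough sketch of the idea in Python (hypothetical, not the patch's C):

```python
def per_array_frequencies(arrays):
    """Count, for each element, the number of arrays containing it at
    least once, without clearing any per-element flag between arrays."""
    freq = {}        # element -> number of arrays containing it
    last_seen = {}   # element -> index of last array that contained it
    for array_no, arr in enumerate(arrays):
        for e in arr:
            if last_seen.get(e) != array_no:   # first occurrence in this array
                last_seen[e] = array_no
                freq[e] = freq.get(e, 0) + 1
    return freq

print(per_array_frequencies([[1, 1, 2], [2, 3], [1]]))  # {1: 2, 2: 2, 3: 1}
```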

> + 
> + 		/* We should free memory from array if it was copied during detoast. */
> + 		if ((Datum) array != value)
> + 			pfree((void *) array);

No need for the casts to "void *".

> + 	}
> + 
> + 	/* Skip slots occupied by standard statistics */
> + 	while (OidIsValid(stats->stakind[slot_idx]))
> + 		slot_idx++;
> + 
> + 	/* Fill histogram of arrays lengths. */
> + 	lengths_count = hash_get_num_entries(length_tab);
> + 	if (lengths_count > 0)
> + 	{
> + 		int			num_hist = stats->attr->attstattarget;
> + 		int			delta;
> + 		int			frac;
> + 		int			i;
> + 		Datum	   *hist_values;
> + 
> + 		/* Copy lengths statistics from hashtab to array and sort them. */
> + 		length_index = 0;
> + 		sorted_length_tab = (LengthItem *) palloc(sizeof(LengthItem) * lengths_count);
> + 		hash_seq_init(&scan_status, length_tab);
> + 		while ((length_item = (LengthItem *) hash_seq_search(&scan_status)) != NULL)
> + 		{
> + 			memcpy(&sorted_length_tab[length_index], length_item,
> + 				   sizeof(LengthItem));
> + 			length_index++;
> + 		}
> + 		qsort(sorted_length_tab, lengths_count, sizeof(LengthItem),
> + 			  lengthitem_compare_element);
> + 
> + 		/* Histogram should be stored in anl_context. */
> + 		hist_values = (Datum *) MemoryContextAlloc(stats->anl_context,
> + 												   sizeof(Datum) * num_hist);
> + 		/* Fill histogram by hashtab. */
> + 		delta = samplerows - null_cnt - 1;
> + 		length_index = 0;
> + 		frac = sorted_length_tab[0].frequency * (num_hist - 1);
> + 		for (i = 0; i < num_hist; i++)
> + 		{
> + 			hist_values[i] =
> + 				Int32GetDatum(sorted_length_tab[length_index].length);
> + 			frac -= delta;
> + 			while (frac <= 0)
> + 			{
> + 				length_index++;
> + 				frac += sorted_length_tab[length_index].frequency *
> + 					(num_hist - 1);
> + 			}
> + 		}
> + 
> + 		stats->stakind[slot_idx] = STATISTIC_KIND_LENGTH_HISTOGRAM;
> + 		stats->staop[slot_idx] = Int4EqualOperator;
> + 		stats->stavalues[slot_idx] = hist_values;
> + 		stats->numvalues[slot_idx] = num_hist;
> + 		/* We are storing values of element type */
> + 		stats->statypid[slot_idx] = INT4OID;
> + 		stats->statyplen[slot_idx] = 4;
> + 		stats->statypbyval[slot_idx] = true;
> + 		stats->statypalign[slot_idx] = 'i';
> + 		slot_idx++;
> + 	}
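(Aside: the frac-driven loop above selects length values at equal cumulative-row steps, i.e. an equi-height histogram. A rough Python rendering of the same selection, with a bounds guard added for safety:)

```python
def length_histogram(length_freqs, num_hist):
    """length_freqs: (length, frequency) pairs sorted by length.
    Picks num_hist boundary values at equal cumulative-count steps."""
    delta = sum(f for _, f in length_freqs) - 1
    hist = []
    idx = 0
    frac = length_freqs[0][1] * (num_hist - 1)
    for _ in range(num_hist):
        hist.append(length_freqs[idx][0])
        frac -= delta
        while frac <= 0 and idx + 1 < len(length_freqs):
            idx += 1
            frac += length_freqs[idx][1] * (num_hist - 1)
    return hist

print(length_histogram([(1, 3), (2, 3), (3, 3)], num_hist=3))  # [1, 2, 3]
```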
> + 
> + 	/* We can only compute real stats if we found some non-null values. */
> + 	if (null_cnt < samplerows)
> + 	{
> + 		int			nonnull_cnt = samplerows - null_cnt;
> + 		int			i;
> + 		TrackItem **sort_table;
> + 		int			track_len;
> + 		int			cutoff_freq;
> + 		int			minfreq,
> + 					maxfreq;
> + 
> + 		stats->stats_valid = true;
> + 		/* Do the simple null-frac and average width stats */
> + 		stats->stanullfrac = (double) null_cnt / (double) samplerows;
> + 		stats->stawidth = total_width / (double) nonnull_cnt;

Isn't this redundant with the calculations made in compute_scalar_stats()?

> + 
> + 		/* Assume it's a unique column (see notes above) */
> + 		stats->stadistinct = -1.0;

The comment, copied from ts_typanalyze(), refers to notes not likewise copied.
We should probably instead leave whatever compute_scalar_stats() calculated.

> *** a/src/backend/utils/adt/selfuncs.c
> --- b/src/backend/utils/adt/selfuncs.c
> ***************
> *** 1705,1710 **** scalararraysel(PlannerInfo *root,
> --- 1705,1736 ----
>   	RegProcedure oprsel;
>   	FmgrInfo	oprselproc;
>   	Selectivity s1;
> + 	bool		varonleft;
> + 	Node	   *other;
> + 	VariableStatData vardata;
> + 	
> + 	/*
> + 	 * Handle "const = qual(column)" case using array column statistics.
> + 	 */
> + 	if (get_restriction_variable(root, clause->args, varRelid,
> + 								  &vardata, &other, &varonleft))
> + 	{
> + 		Oid elemtype;
> + 		elemtype = get_base_element_type(vardata.vartype);
> + 		if (elemtype != InvalidOid && IsA(other, Const))
> + 		{
> + 			if (((Const *) other)->constisnull)
> + 			{
> + 				/* qual can't succeed if null array */
> + 				ReleaseVariableStats(vardata);
> + 				return (Selectivity) 0.0;
> + 			}
> + 			s1 = calc_scalararraysel(&vardata, ((Const *) other)->constvalue, useOr);
> + 			ReleaseVariableStats(vardata);
> + 			return s1;
> + 		}
> + 		ReleaseVariableStats(vardata);
> + 	}

If we're going to add a new file for array selectivity functions (which seems
reasonable), scalararraysel() should also move there.  (To do this and still
keep the patch easy to read, you could do the move in a pre-patch.)

> *** a/src/include/catalog/pg_proc.h
> --- b/src/include/catalog/pg_proc.h
> ***************
> *** 849,854 **** DATA(insert OID = 2334 (  array_agg_finalfn   PGNSP PGUID 12 1 0 0 0 f f f f f i
> --- 849,859 ----
>   DESCR("aggregate final function");
>   DATA(insert OID = 2335 (  array_agg		   PGNSP PGUID 12 1 0 0 0 t f f f f i 1 0 2277 "2283" _null_ _null_ _null_ _null_ aggregate_dummy _null_ _null_ _null_ ));
>   DESCR("concatenate aggregate input into an array");
> + DATA(insert OID = 3816 (  array_typanalyze PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 16 "2281" _null_ _null_ _null_ _null_ array_typanalyze _null_ _null_ _null_ ));
> + DESCR("array statistics collector");
> + #define ArrayTypanalyzeOid 3816

Use the fmgroids.h symbol, F_ARRAY_TYPANALYZE, instead.

> + DATA(insert OID = 3817 (  arraysel		   PGNSP PGUID 12 1 0 0 0 f f f t f s 4 0 701 "2281 26 2281 23" _null_ _null_ _null_ _null_ arraysel _null_ _null_ _null_ ));
> + DESCR("array selectivity estimation functions");
>   
>   DATA(insert OID = 760 (  smgrin			   PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 210 "2275" _null_ _null_ _null_ _null_	smgrin _null_ _null_ _null_ ));
>   DESCR("I/O");
> *** a/src/include/catalog/pg_statistic.h
> --- b/src/include/catalog/pg_statistic.h

>   /*
>    * Currently, three statistical slot "kinds" are defined: most common values,

Here's a larger quote of the comment starting here:

/*
 * Currently, three statistical slot "kinds" are defined: most common values,
 * histogram, and correlation.	Additional "kinds" will probably appear in
 * future to help cope with non-scalar datatypes.  Also, custom data types
 * can define their own "kind" codes by mutual agreement between a custom
 * typanalyze routine and the selectivity estimation functions of the type's
 * operators.

That needs an update.  (It already needs an update for STATISTIC_KIND_MCELEM,
but this patch would invalidate it yet again.)

> ***************
> *** 260,263 **** typedef FormData_pg_statistic *Form_pg_statistic;
> --- 268,274 ----
>    */
>   #define STATISTIC_KIND_MCELEM  4
>   
> + 
> + #define STATISTIC_KIND_LENGTH_HISTOGRAM  5

The other kinds have long explanatory comments; this one should, too.

> *** a/src/include/catalog/pg_type.h
> --- b/src/include/catalog/pg_type.h

> ! DATA(insert OID = 1021 (  _float4	 PGNSP PGUID -1 f b A f t \054 0 700 0 array_in array_out array_recv array_send - - array_typanalyze i x f 0 -1 0 0 _null_ _null_ ));

With this patch, a fresh database sets typanalyze = array_typanalyze for 27
array types and leaves typanalyze = NULL for the other 38 array types.  What
is the rationale for the split?  For example, why does real[] use the new
typanalyze but numeric[] does not?

Thanks,
nm
