pgsql: Collect and use element-frequency statistics for arrays.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Collect and use element-frequency statistics for arrays.
Date: 2012-03-04 01:21:19
Message-ID: E1S408V-0008Ng-A5@gemulon.postgresql.org
Lists: pgsql-committers

Collect and use element-frequency statistics for arrays.

This patch improves selectivity estimation for the array <@, &&, and @>
(containment and overlaps) operators. It enables collection of statistics
about individual array element values by ANALYZE, and introduces
operator-specific estimators that use these stats. In addition,
ScalarArrayOpExpr constructs of the forms "const = ANY/ALL (array_column)"
and "const <> ANY/ALL (array_column)" are estimated by treating them as
variants of the containment operators.
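
As a rough illustration of how per-element frequencies can feed such estimates, here is a minimal Python sketch. It is not the code in array_selfuncs.c: it assumes that array elements occur independently of one another, and the names `contains_selectivity`, `overlap_selectivity`, and `DEFAULT_ELEM_FREQ` (a fallback for elements with no recorded frequency) are hypothetical, chosen only for this example.

```python
from typing import Dict, Hashable, Iterable

# Hypothetical fallback frequency for elements that have no recorded
# statistics; the value is an assumption made for this sketch only.
DEFAULT_ELEM_FREQ = 0.005


def contains_selectivity(freqs: Dict[Hashable, float],
                         const_elems: Iterable[Hashable],
                         default: float = DEFAULT_ELEM_FREQ) -> float:
    """Estimate the fraction of rows where array_column @> const_array.

    Every distinct constant element must be present, so under the
    independence assumption the per-element frequencies multiply.
    """
    sel = 1.0
    for elem in set(const_elems):
        sel *= freqs.get(elem, default)
    return sel


def overlap_selectivity(freqs: Dict[Hashable, float],
                        const_elems: Iterable[Hashable],
                        default: float = DEFAULT_ELEM_FREQ) -> float:
    """Estimate the fraction of rows where array_column && const_array.

    At least one constant element must be present, so compute the
    probability that none is present and take the complement.
    """
    none_present = 1.0
    for elem in set(const_elems):
        none_present *= 1.0 - freqs.get(elem, default)
    return 1.0 - none_present


# Fraction of rows whose array contains each element (made-up numbers).
example_freqs = {"a": 0.5, "b": 0.2}
contain_est = contains_selectivity(example_freqs, ["a", "b"])
overlap_est = overlap_selectivity(example_freqs, ["a", "b"])
# "const = ANY (array_column)" behaves like containment of a
# one-element array, so its estimate is just that element's frequency.
any_est = contains_selectivity(example_freqs, ["a"])
```

In this simplified model, "const = ANY (array_column)" reduces to the containment case with a single-element constant array, which mirrors the commit's description of those constructs as variants of the containment operators.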

Since we still collect scalar-style stats about the array values as a
whole, the pg_stats view is expanded to show both these stats and the
array-style stats in separate columns. This creates an incompatible change
in how stats for tsvector columns are displayed in pg_stats: the stats
about lexemes are now displayed in the array-related columns instead of the
original scalar-related columns.

There are a few loose ends here, notably that it'd be nice to be able to
suppress either the scalar-style stats or the array-element stats for
columns for which they're not useful. But the patch is in good enough
shape to commit for wider testing.

Alexander Korotkov, reviewed by Noah Misch and Nathan Boley

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/0e5e167aaea4ceb355a6e20eec96c4f7d05527ab

Modified Files
--------------
doc/src/sgml/catalogs.sgml                |   51 +-
src/backend/catalog/heap.c                |    2 +-
src/backend/catalog/system_views.sql      |   43 +-
src/backend/commands/analyze.c            |   12 +-
src/backend/commands/typecmds.c           |    6 +-
src/backend/tsearch/ts_selfuncs.c         |    4 +
src/backend/tsearch/ts_typanalyze.c       |    5 +
src/backend/utils/adt/Makefile            |    3 +-
src/backend/utils/adt/array_selfuncs.c    | 1225 +++++++++++++++++++++++++++++
src/backend/utils/adt/array_typanalyze.c  |  762 ++++++++++++++++++
src/backend/utils/adt/selfuncs.c          |   58 ++-
src/include/catalog/catversion.h          |    2 +-
src/include/catalog/pg_operator.h         |    9 +-
src/include/catalog/pg_proc.h             |    6 +
src/include/catalog/pg_statistic.h        |   96 ++--
src/include/catalog/pg_type.h             |  132 ++--
src/include/commands/vacuum.h             |   11 +-
src/include/utils/array.h                 |    5 +
src/include/utils/selfuncs.h              |   14 +-
src/test/regress/expected/arrays.out      |    1 +
src/test/regress/expected/rules.out       |    2 +-
src/test/regress/expected/type_sanity.out |   33 +
src/test/regress/sql/arrays.sql           |    2 +
src/test/regress/sql/type_sanity.sql      |   25 +
24 files changed, 2341 insertions(+), 168 deletions(-)
