Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Statistics Import and Export
Date: 2023-08-31 06:47:31
Message-ID: CADkLM=cB0rF3p_FuWRTMSV0983ihTRpsH+OCpNyiqE7Wk0vUWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

pg_stats_export is a view that aggregates pg_statistic data by relation
oid and stores all of the column statistical data in a system-indepdent
(i.e.
no oids, collation information removed, all MCV values rendered as text)
jsonb format, along with the relation's relname, reltuples, and relpages
from pg_class, as well as the schemaname from pg_namespace.

pg_import_rel_stats is a function which takes a relation oid,
server_version_num, num_tuples, num_pages, and a column_stats jsonb in
a format matching that of pg_stats_export, and applies that data to
the specified pg_class and pg_statistics rows for the relation
specified.

The most common use-case for such a function is in upgrades and
dump/restore, wherein the upgrade process would capture the output of
pg_stats_export into a regular table, perform the upgrade, and then
join that data to the existing pg_class rows, updating statistics to be
a close approximation of what they were just prior to the upgrade. The
hope is that these statistics are better than the early stages of
--analyze-in-stages and can be applied faster, thus reducing system
downtime.

The values applied to pg_class are done inline, which is to say
non-transactionally. The values applied to pg_statitics are applied
transactionally, as if an ANALYZE operation was reading from a
cheat-sheet.

This function and view will need to be followed up with corresponding
ones for pg_stastitic_ext and pg_stastitic_ext_data, and while we would
likely never backport the import functions, we can have user programs
do the same work as the export views such that statistics can be brought
forward from versions as far back as there is jsonb to store it.

While the primary purpose of the import function(s) are to reduce downtime
during an upgrade, it is not hard to see that they could also be used to
facilitate tuning and development operations, asking questions like "how
might
this query plan change if this table has 1000x rows in it?", without
actually
putting those rows into the table.

Attachment Content-Type Size
v1-0001-Introduce-the-system-view-pg_stats_export-and-the.patch text/x-patch 39.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-08-31 06:55:17 Re: persist logical slots to disk during shutdown checkpoint
Previous Message Amit Kapila 2023-08-31 06:39:53 Re: persist logical slots to disk during shutdown checkpoint