Re: Statistics Import and Export

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Statistics Import and Export
Date: 2024-03-06 10:06:39
Message-ID: CAEze2WgdmG67-9wA2TeQwnV6QieYMVreXob84VT5GfuHOi77yg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 1 Mar 2024, 04:55 Corey Huinker, <corey(dot)huinker(at)gmail(dot)com> wrote:
>> Also per our prior discussion- this makes sense to include in post-data section, imv, and also because then we have the indexes we may wish to load stats for, but further that also means it’ll be in the paralleliziable part of the process, making me a bit less concerned overall about the individual timing.
>
>
> The ability to parallelize is pretty persuasive. But is that per-statement parallelization or do we get transaction blocks? i.e. if we ended up importing stats like this:
>
> BEGIN;
> LOCK TABLE schema.relation IN SHARE UPDATE EXCLUSIVE MODE;
> LOCK TABLE pg_catalog.pg_statistic IN ROW UPDATE EXCLUSIVE MODE;
> SELECT pg_import_rel_stats('schema.relation', ntuples, npages);
> SELECT pg_import_pg_statistic('schema.relation', 'id', ...);
> SELECT pg_import_pg_statistic('schema.relation', 'name', ...);

How well would this simplify to the following:

SELECT pg_import_statistic('schema.relation', attname, ...)
FROM (VALUES ('id', ...), ...) AS relation_stats (attname, ...);

Or even just one VALUES for the whole statistics loading?

I suspect the main issue with combining this into one statement
(transaction) is that failure to load one column's statistics implies
you'll have to redo all the other statistics (or fail to load the
statistics at all), which may be problematic at the scale of thousands
of relations with tens of columns each.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2024-03-06 10:25:29 Re: Regarding the order of the header file includes
Previous Message Hayato Kuroda (Fujitsu) 2024-03-06 10:02:06 RE: speed up a logical replica setup