Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Statistics Import and Export
Date: 2024-03-01 03:55:20
Message-ID: CADkLM=dUGyQMK8Rhz1gVLeW8u2nBWOLvH7-j82MAAN_PmcSAQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
> That’s certainly a fair point and my initial reaction (which could
> certainly be wrong) is that it’s unlikely to be an issue- but also, if you
> feel you could make it work with an array and passing all the attribute
> info in with one call, which I suspect would be possible but just a bit
> more complex to build, then sure, go for it. If it ends up being overly
> unwieldy then perhaps the per-attribute call would be better and we could
> perhaps acquire the lock before the function calls..? Doing a check to see
> if we have already locked it would be cheaper than trying to acquire a new
> lock, I’m fairly sure.
>

Well the do_analyze() code was already ok with acquiring the lock once for
non-inherited stats and again for inherited stats, so the locks were
already not the end of the world. However, that's at most a 2x of the
locking required, and this would natts * x, quite a bit more. Having the
procedures check for a pre-existing lock seems like a good compromise.

> Also per our prior discussion- this makes sense to include in post-data
> section, imv, and also because then we have the indexes we may wish to load
> stats for, but further that also means it’ll be in the paralleliziable part
> of the process, making me a bit less concerned overall about the individual
> timing.
>

The ability to parallelize is pretty persuasive. But is that per-statement
parallelization or do we get transaction blocks? i.e. if we ended up
importing stats like this:

BEGIN;
LOCK TABLE schema.relation IN SHARE UPDATE EXCLUSIVE MODE;
LOCK TABLE pg_catalog.pg_statistic IN ROW UPDATE EXCLUSIVE MODE;
SELECT pg_import_rel_stats('schema.relation', ntuples, npages);
SELECT pg_import_pg_statistic('schema.relation', 'id', ...);
SELECT pg_import_pg_statistic('schema.relation', 'name', ...);
SELECT pg_import_pg_statistic('schema.relation', 'description', ...);
...
COMMIT;

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-03-01 04:08:15 Re: Improve readability by using designated initializers when possible
Previous Message Zhijie Hou (Fujitsu) 2024-03-01 03:42:23 RE: Synchronizing slots from primary to standby