| From: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> |
|---|---|
| To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
| Cc: | Hannu Krosing <hannuk(at)google(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipkumarb(at)google(dot)com>, Antonin Houska <ah(at)cybertec(dot)at> |
| Subject: | Re: Patch: VACUUM should ignore (CREATE |RE)INDEX CONCURRENTLY for xmin horizon calculations |
| Date: | 2025-11-27 02:30:00 |
| Message-ID: | CADzfLwXN4NXv8C+8GzbMJvRaBkJMs838c92CM-6Js-=Wpi5aRQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hello!
> If you're interested in improving CIC and reducing its impact on
> visibility horizons, you may be interested in reviewing Mihail's work
> in [0].
Actually I reduced the scope of the patch to only
single-heap-scan-STIR-based-CIC. But patch files are still available.
I did it because it looks like we need first a solution which also
will work for REPACK concurrently, see [1].
Reset-snapshot technique may still be applied later though (to reduce
bloat in the target table).
> In that case VACUUM and prune operations would have to build
> visibility horizons for each table, and I don't think that's viable;
> especially so in a heavily partitioned workload. It'd also be
> prohibitively expensive to add a per-table Xmin -- we don't always
> know which tables will be accessed by any backend until they lock that
> table, but at that time they may already have a snapshot that they
> need tuples from. If we lazily included that backend, there may be
> visibility horizons and prune operations that were built and executed
> ahead of them notifying the other backends of their use, and pruned
> away still-visible tuples like in the aforementioned CIC hot pruning
> issue.
I think I know the way to implement it with almost zero cost for a
"normal" situation.
It is based on Antonin's idea to change
GetOldestNonRemovableTransactionId(rel) because we have Relation as an
argument.
We may put some "pinned" xmin into Relation and fill it only while
filling RelCache by accessing some shared memory.
Of course there is a lot of things we need to worry about:
* make sure every backend will respect it (invalidate cache and wait
for each backend to "know" about it)
* make sure horizon moved only forward
* handle all kind of error of backend who "pinned" horizon
* etc
In a few words, protocol of "pinning" is next:
* get a "normal snapshot"
* put its xmin into shared memory as newest horizon value possible for
a particular relation
* invalidate RelCache for that relation
* wait for all transaction having any locks on that relation to commit
(and, as result, drop old cached Relation)
* we a free to release snapshot and set some PROC flag to ignore our
xmin for data (not catalog) horizon
* GetOldestNonRemovableTransactionId(rel) will return
TransactionIdOlder(actual_horizon, pinned) taking "pinned" from `rel`.
If someone is interested - some dirty version is available at [1], but
it is better to wait for a more-less polished patchset.
[0]: https://www.postgresql.org/message-id/flat/202510301734.pj4uds3mqxx4%40alvherre.pgsql#fd20662912580a89b860790f9729aaff
[1]: https://github.com/michail-nikolaev/postgres/commits/make-PROC_IN_SAFE_IC-great-again/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Fujii Masao | 2025-11-27 02:46:10 | Re: Allow GUC settings in CREATE SUBSCRIPTION CONNECTION to take effect |
| Previous Message | Mihail Nikalayeu | 2025-11-27 01:54:00 | Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY |