| From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
|---|---|
| To: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
| Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Evgeny Voropaev <evgeny(dot)voropaev(at)tantorlabs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
| Subject: | Re: Add 64-bit XIDs into PostgreSQL 15 |
| Date: | 2026-05-04 10:16:10 |
| Message-ID: | CAPpHfdsxPph68qJ3ddth=PyZV=9S_6j+hPrgAn+KNw6v1NVoCw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi!
On Fri, Feb 27, 2026 at 4:22 PM Maxim Orlov <orlovmg(at)gmail(dot)com> wrote:
>
> I just wanted to share my work-in-progress on switching the clog to 64
> bits. So far, I've only dealt with the clog itself. Right now, support
> for the x86 platform is lacking. But the first two patches are
> refactorings and can be committed promptly.
Based on the discussion upthread (huge thanks to everybody involved!),
we can figure out a plan to rework the patches so that they target
PostgreSQL 20.
1. Keep `TransactionId` as 32-bit; use `FullTransactionId` only
locally where epoch matters.
2. Keep the existing limit of 2^31 distance between running
transactions; do not lift it as part of the patches.
3. Store compact 32-bit base values in the heap page special area.
Calculate as follows: xid64 = (base << 31) + xid32. This effectively
limits 64-bit transaction ids to 63 bits. But, as shown, this
shouldn't be a limit, since our xlog pointers are 64-bit.
4. Eliminate the "double xmax" format; use a `pd_flags` bit and
overlay base values onto existing page header fields when there is no
space for a special area.
5. Split the work into the smallest possible self-contained patches;
keep mechanical refactors strictly separate from substantive changes.
6. Make "page-level epoch + lazy freeze" the first useful step,
before touching clog.
7. Do the language cleanup: "micro-vacuum" => "HOT pruning", "buffers
cutting" => "SLRU truncation", "vacuum freeze required" => "aggressive
vacuum", etc., as discussed upthread.
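For illustration, the arithmetic from item 3 can be sketched as below.
This is only a sketch: `sketch_xid64_from_base` is a hypothetical name,
not an existing PostgreSQL function, and the fixed-width C types stand
in for the real typedefs.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an existing PostgreSQL function): combine a
 * page-level 32-bit base with a tuple-level 32-bit xid as in item 3,
 * xid64 = (base << 31) + xid32.
 */
static inline uint64_t
sketch_xid64_from_base(uint32_t base, uint32_t xid32)
{
    /* Widen before shifting so the high bits of base are not lost. */
    return ((uint64_t) base << 31) + xid32;
}
```

With base = 1 and xid32 = 5 this gives 2^31 + 5. Even the largest
32-bit base, shifted by 31, leaves bit 63 clear, which is consistent
with the 63-bit limit mentioned in item 3.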
I can propose dividing the patchset into the following phases (each
phase can contain multiple patches).
Phase 0: Mechanical refactors
Implement accessor renames (HeapTupleHeaderGetXmin =>
HeapTupleGetXmin, HeapTupleHeaderGetXmax => HeapTupleGetXmax, ...),
and their signature changes in separate atomic patches to make review
as easy as possible.
Phase 1: Page-level XID and MXID base
Introduce an 8-byte heap-page special area for 32-bit pd_xid_base and
32-bit pd_multi_base. Make xmin/xmax accessors calculate the 64-bit
value as (base << 31) + xid32, where base is pd_xid_base for xids and
pd_multi_base for multixacts, then convert it back to a 32-bit value
according to the current epoch. For transactions less than
relfrozenxid, return FrozenTransactionId.
Implement heap_page_prepare_for_xid() / heap_page_prepare_for_multi():
shift base when inserting a new xid/multixact to the page if needed.
Use these functions before inserting a new xid/multixact to the page.
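A rough sketch of how the phase 1 accessor and the "does this xid fit
on the page" check might look. Everything named `sketch_*` or `SKETCH_*`
is hypothetical. The sketch assumes that the special xid values below
FirstNormalTransactionId are stored without a base, that the frozen
horizon is available as a 64-bit value, and that "convert back to
32-bit" means taking the low 32 bits. None of these are settled by the
plan above, and the case where truncation lands on a special value is
ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FROZEN_XID        2u  /* same value as FrozenTransactionId */
#define SKETCH_FIRST_NORMAL_XID  3u  /* same value as FirstNormalTransactionId */

/*
 * Hypothetical accessor: turn the raw 32-bit xid stored in a tuple into
 * the 32-bit xid used for visibility checks.
 */
static uint32_t
sketch_get_xid(uint32_t pd_xid_base, uint32_t raw_xid, uint64_t frozen_horizon64)
{
    uint64_t    full;

    /* Assumption: special xids (invalid, bootstrap, frozen) carry no base. */
    if (raw_xid < SKETCH_FIRST_NORMAL_XID)
        return raw_xid;

    full = ((uint64_t) pd_xid_base << 31) + raw_xid;

    /* Anything older than the horizon is reported as frozen. */
    if (full < frozen_horizon64)
        return SKETCH_FROZEN_XID;

    /* Assumption: the 32-bit value is the low half of the 64-bit one. */
    return (uint32_t) full;
}

/*
 * Hypothetical check a heap_page_prepare_for_xid()-style function could
 * start with: can new_xid64 be stored on a page with the given base?
 */
static bool
sketch_xid_fits_on_page(uint32_t pd_xid_base, uint64_t new_xid64)
{
    uint64_t    base64 = (uint64_t) pd_xid_base << 31;
    uint64_t    offset;

    if (new_xid64 < base64)
        return false;           /* xid is older than the page base */

    offset = new_xid64 - base64;

    /* The stored offset must be a normal xid and fit in 32 bits. */
    return offset >= SKETCH_FIRST_NORMAL_XID && offset <= UINT32_MAX;
}
```

When the check returns false, the base would have to be shifted and
the xids already on the page re-expressed relative to the new base;
that part is left out of the sketch.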
Phase 2: Lazy freeze win
During aggressive vacuum, do not overwrite old xids with
FrozenTransactionId. After phase 1, the accessors return that value
automatically for xids below our frozen horizon. Correspondingly, for a page
with no dead tuples, do not dirty that page. Update `heap_prune_chain`
and `lazy_scan_prune` to skip dirtying when no actual change is
needed.
Phase 3: Overlay format + page format conversion
Introduce a new PD_HAS_NO_SPECIAL bit in pd_flags. When set:
- pd_prune_xid (4 bytes) holds pd_xid_base;
- the (pd_pagesize_version, pd_special) pair (4 bytes) holds pd_multi_base;
- the values to use instead of pd_pagesize_version and pd_special
are calculated from constants.
When reading the xid/multixact base values, first check the
PD_HAS_NO_SPECIAL flag to find the correct location of the base values.
When PD_HAS_NO_SPECIAL is not set, but no special area is
allocated on the page, the page must be converted. The conversion
point should not be in the buffer manager, but inside the heapam.
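To make the lookup rule concrete, here is a minimal sketch of locating
the base values under the overlay format. It is not proposed code: the
struct only imitates the field order of the page header, the bit value
chosen for the flag is a guess, and the order of the two bases inside
the special area and inside the overlaid 4 bytes is an assumption. All
`sketch_*`/`SKETCH_*` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE          8192
#define SKETCH_SPECIAL_SIZE       8       /* two 32-bit bases */
#define SKETCH_PD_HAS_NO_SPECIAL  0x0008  /* hypothetical bit value */

/* Simplified stand-in for the page header; same field order, no line pointers. */
typedef struct SketchPageHeader
{
    uint64_t    pd_lsn;
    uint16_t    pd_checksum;
    uint16_t    pd_flags;
    uint16_t    pd_lower;
    uint16_t    pd_upper;
    uint16_t    pd_special;
    uint16_t    pd_pagesize_version;
    uint32_t    pd_prune_xid;
} SketchPageHeader;

typedef struct SketchBases
{
    uint32_t    xid_base;
    uint32_t    multi_base;
} SketchBases;

/* True when the flag is unset and there is no room for a special area. */
static bool
sketch_page_needs_conversion(const uint8_t *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    if (hdr.pd_flags & SKETCH_PD_HAS_NO_SPECIAL)
        return false;
    return (size_t) hdr.pd_special + SKETCH_SPECIAL_SIZE > SKETCH_PAGE_SIZE;
}

/* Read both bases; returns false if the page must be converted first. */
static bool
sketch_read_bases(const uint8_t *page, SketchBases *out)
{
    SketchPageHeader hdr;

    if (sketch_page_needs_conversion(page))
        return false;

    memcpy(&hdr, page, sizeof(hdr));
    if (hdr.pd_flags & SKETCH_PD_HAS_NO_SPECIAL)
    {
        /* Overlay: pd_prune_xid holds the xid base ... */
        out->xid_base = hdr.pd_prune_xid;
        /* ... and the 4 bytes of (pd_special, pd_pagesize_version) hold the multi base. */
        memcpy(&out->multi_base,
               page + offsetof(SketchPageHeader, pd_special),
               sizeof(uint32_t));
    }
    else
    {
        /* Special area: assumed layout is xid base first, then multi base. */
        memcpy(&out->xid_base, page + hdr.pd_special, sizeof(uint32_t));
        memcpy(&out->multi_base, page + hdr.pd_special + sizeof(uint32_t),
               sizeof(uint32_t));
    }
    return true;
}
```

In this sketch the "needs conversion" check is a plain function that a
heapam-level caller would invoke, in line with keeping the conversion
point out of the buffer manager.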
Phase 4: 64-bit SLRUs
Extend the clog and multixact-offsets SLRUs to 64-bit indexing. Add
pg_upgrade handling for existing clog segments (rename/convert).
Convert HeapTupleHeaderGetXmin()/HeapTupleHeaderGetXmax() to return
FullTransactionId. Use FullTransactionId for lookups in
clog/multixact-offsets. Convert it to TransactionId for visibility
checking.
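As a sketch of what 64-bit clog indexing amounts to, the following
mirrors the existing clog layout (2 status bits per transaction, 4
transactions per byte) but takes a 64-bit xid. The `sketch_*` functions
and `SKETCH_*` macros are hypothetical names, and the 8192-byte block
size is the default rather than a given.

```c
#include <stdint.h>

#define SKETCH_BLCKSZ               8192    /* assumes the default block size */
#define SKETCH_CLOG_BITS_PER_XACT   2
#define SKETCH_CLOG_XACTS_PER_BYTE  4
#define SKETCH_CLOG_XACTS_PER_PAGE  (SKETCH_BLCKSZ * SKETCH_CLOG_XACTS_PER_BYTE)

/* SLRU page number; 64-bit, so it keeps growing past the 32-bit xid range. */
static inline int64_t
sketch_clog_page(uint64_t fxid)
{
    return (int64_t) (fxid / SKETCH_CLOG_XACTS_PER_PAGE);
}

/* Byte offset of the transaction's status within its page. */
static inline uint32_t
sketch_clog_byte(uint64_t fxid)
{
    return (uint32_t) ((fxid % SKETCH_CLOG_XACTS_PER_PAGE) /
                       SKETCH_CLOG_XACTS_PER_BYTE);
}

/* Bit shift of the 2-bit status within that byte. */
static inline uint32_t
sketch_clog_bshift(uint64_t fxid)
{
    return (uint32_t) ((fxid % SKETCH_CLOG_XACTS_PER_BYTE) *
                       SKETCH_CLOG_BITS_PER_XACT);
}

/* Low 32 bits for visibility checks, as XidFromFullTransactionId() does. */
static inline uint32_t
sketch_xid_from_full(uint64_t fxid)
{
    return (uint32_t) fxid;
}
```

With these definitions xid 2^32 maps to page 131072 instead of wrapping
back to page 0, while its 32-bit form for visibility checks is 0.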
Phase 5: Drop anti-wraparound vacuums
Aggressive vacuum is now needed only for SLRU truncation, not to
prevent wraparound. Remove wraparound-emergency code paths. Revise the
documentation to clarify the new purpose of aggressive vacuum.
Also, it's essential for all of the above to come with a set of scripts
exercising pg_upgrade in a fairly comprehensive set of scenarios.
------
Regards,
Alexander Korotkov
Supabase