Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tianzhou Chen <tianzhouchen(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?
Date: 2017-11-27 21:41:19
Message-ID: CAPpHfdvwOU=H_8uxok-yp7z9+gz4kibq61ipD+oRernnRK3pVA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 27, 2017 at 10:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Nov 24, 2017 at 5:33 AM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > pd_prune_xid makes sense only for heap pages. Once we introduce special
> > area for heap pages, we can move pd_prune_xid there and save some bytes in
> > index pages. However, this is an optimization not directly related to
> > 64-bit xids. Idea is that if we anyway change page format, why don't apply
> > this optimization as well? But if we have any doubts in this, it can be
> > removed with no problem.
>
> My first reaction is that changing the page format seems like a
> non-starter, because it would break pg_upgrade. If we get the heap
> storage API working, then we could have a heap AM that works as it
> does today and a newheap AM with such changes, but I have a bit of a
> hard time imagining a patch that causes a hard compatibility break
> ever being accepted.

Thank you for raising this question. There was a discussion about 64-bit
xids during PGCon 2017. A couple of ways to support pg_upgrade were discussed.

1) We have a page layout version in the page (the current one is number 4).
So we can define a new page layout version 5. Pages with the new layout
version would contain 64-bit base values for xid and multixact. The question
is how to deal with pages of layout version 4. If such a page has enough
free space to fit the extra 16 bytes, it could be upgraded on the fly. If it
doesn't have enough space for that, things become more complicated: we
can't upgrade it to the new format, but we still need to fit a new xmax
value there when a tuple is updated or deleted. pg_upgrade requires a
server restart. Thus, once we set hint bits, the pre-pg_upgrade xmin is not
really meaningful: the corresponding xid is visible to every
post-pg_upgrade snapshot. So the idea is to use both the xmin and xmax tuple
fields on such an unupgradable page to store a 64-bit xmax. I proposed this
idea, but some session attendees (sorry, I don't recall who they were)
criticized it for its complexity and suspected overhead.
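
To make option 1 more concrete, here is a minimal Python sketch of the
three pieces involved: the free-space check, reading a 64-bit xid on a
version-5 page, and storing a 64-bit xmax on a version-4 page that could not
be upgraded. All function names are hypothetical. Adding the page base to
the 32-bit tuple value, and putting the high half of xmax into the xmin
field, are assumptions made for illustration; the discussion did not fix
these details.

```python
EXTRA_BYTES = 16          # two 64-bit base values: one for xid, one for multixact
MASK32 = 0xFFFFFFFF


def can_upgrade_in_place(free_bytes):
    """A version-4 page can be converted on the fly only if 16 more bytes fit."""
    return free_bytes >= EXTRA_BYTES


def full_xid(page_base, short_xid):
    """Version-5 page (assumed scheme): 64-bit xid = page base + 32-bit tuple value."""
    return page_base + (short_xid & MASK32)


def pack_xmax64(xmax64):
    """Unupgradable page: spread a 64-bit xmax over the two 32-bit tuple fields.

    Returns (value stored in the xmin field, value stored in the xmax field).
    Which half goes where is an assumption.
    """
    return (xmax64 >> 32) & MASK32, xmax64 & MASK32


def unpack_xmax64(xmin_field, xmax_field):
    """Inverse of pack_xmax64: rebuild the 64-bit xmax from the two fields."""
    return ((xmin_field & MASK32) << 32) | (xmax_field & MASK32)
```

Reusing the xmin field is only safe under the condition stated above: the
original xmin is already visible to every snapshot taken after pg_upgrade,
so its value no longer needs to be kept.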

2) An alternative idea was to use unused bits in the page header. If we
look for unused bits in pd_flags (3 bits of 16 are used),
pd_pagesize_version (we can leave 1 bit of 16 to distinguish between the
old and new formats) and pd_special (we can leave 1 bit to distinguish
sequence pages), we can scrape together 43 bits. That would be more than
enough for a single base value, because we definitely don't need all of the
lower 32 bits of the base value (21 bits is more than enough). But I'm not
sure about two base values: if we leave 2 bits for the lower part of the
base value, that leaves us 19 bits for the high part. This solution would
give us a maximum of 2^51 values for xids and multixacts. I'm not sure
that's enough to treat these counters as infinite. AFAIK, there are
products on the market that have 48-bit transaction identifiers and don't
care about wraparound or anything like that...
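
The bit budget in option 2 can be checked with a few lines of Python. This
is only a sketch of the arithmetic. The variable names are made up, and the
split of 43 bits into two 21-bit base values (with one bit left over) is my
reading of the numbers above; it is the reading that reproduces the 2^51
figure.

```python
FIELD_BITS = 16  # pd_flags, pd_pagesize_version and pd_special are 16-bit fields

free_bits = {
    "pd_flags": FIELD_BITS - 3,             # 3 of 16 bits are in use
    "pd_pagesize_version": FIELD_BITS - 1,  # keep 1 bit to tell old format from new
    "pd_special": FIELD_BITS - 1,           # keep 1 bit to mark sequence pages
}

total_free = sum(free_bits.values())  # bits available in the page header
per_base = total_free // 2            # two base values: xid and multixact
LOW_BITS = 2                          # part of each base overlapping the low 32 bits
high_bits = per_base - LOW_BITS       # part of each base above the low 32 bits
xid_space_bits = high_bits + 32       # width of the resulting xid space

print(total_free, per_base, high_bits, xid_space_bits)
```

With a single base value all 43 bits would be available to it, which is why
one base value is not a concern and two are.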

A new heap AM for 64-bit xids is an interesting idea too. I would even say
that the pluggable storage API being discussed now is excessive for this
particular purpose (but it can still fit!), because in most respects a heap
with 64-bit xids is exactly the same as the current heap (in contrast to a
heap with an undo log, for example). The best-fitting API for a heap with
64-bit xid support would be a pluggable heap page format. But I don't think
it deserves a separate API.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
