Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tianzhou Chen <tianzhouchen(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?
Date: 2017-11-23 13:39:33
Message-ID: CAA4eK1+di2+Nc8-=jZqdDLPzvYw-S-QSDcG6e1nOU3f9-07M0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 22, 2017 at 9:06 PM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On Wed, Jun 7, 2017 at 11:33 AM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>>
>> On Tue, Jun 6, 2017 at 4:05 PM, Peter Eisentraut
>> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>>>
>>> On 6/6/17 08:29, Bruce Momjian wrote:
>>> > On Tue, Jun 6, 2017 at 06:00:54PM +0800, Craig Ringer wrote:
>>> >> Tom's point is, I think, that we'll want to stay pg_upgrade
>>> >> compatible. So when we see a pg10 tuple and want to add a new page
>>> >> with a new page header that has an epoch, but the whole page is full
>>> >> so there isn't 32 bits left to move tuples "down" the page, what do we
>>> >> do?
>>> >
>>> > I guess I am missing something. If you see an old page version number,
>>> > you know none of the tuples are from running transactions so you can
>>> > just freeze them all, after consulting the pg_clog. What am I missing?
>>> > If the page is full, why are you trying to add to the page?
>>>
>>> The problem is if you want to delete from such a page. Then you need to
>>> update the tuple's xmax and stick the new xid epoch somewhere.
>>>
>>> We had an unconference session at PGCon about this. These issues were
>>> all discussed and some ideas were thrown around. We can expect a patch
>>> to appear soon, I think.
>>
>>
>> Right. I'm now working on splitting my large patch for 64-bit xids into
>> patchset.
>> I'm planning to post patchset in the beginning of next week.
>
>
> Work on this patch took longer than I expected. It is still in not so good
> shape, but I decided to publish it anyway in order to not stop progress in
> this area.
> I also tried to split this patch into several. But actually I manage to
> separate few small pieces, while most of changes are remaining in the single
> big diff.
> Long story short, patchset is attached.
>
> 0001-64bit-guc-relopt-1.patch
> This patch implements 64 bit GUCs and relation options which are used in
> further patches.
>
> 0002-heap-page-special-1.patch
> Putting xid and multixact bases into PageHeaderData would take extra 16
> bytes on index pages too. That would be waste of space for indexes. This
> is why I decided to put bases into special area of heap pages.
> This patch adds special area for heap pages contaning prune xid and magic
> number. Magic number is different for regular heap page and sequence page.
>

uint16 pd_pagesize_version;
- TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
ItemIdData pd_linp[FLEXIBLE_ARRAY_MEMBER]; /* line pointer array */
} PageHeaderData;

Why have you moved pd_prune_xid from page header?

> 0003-64bit-xid-1.patch
> It's the major patch. It redefines TransactionID ad 64-bit integer and
> defines 32-bit ShortTransactionID which is used for t_xmin and t_xmax.
> Transaction id comparison becomes straight instead of circular. Base values
> for xids and multixact ids are stored in heap page special. SLRUs also
> became 64-bit and non-circular. To be able to calculate xmin/xmax without
> accessing heap page, base values are copied into HeapTuple. Correspondingly
> HeapTupleHeader(Get|Set)(Xmin|Xmax) becomes just
> HeapTuple(Get|Set)(Xmin|Xmax) whose require HeapTuple not just
> HeapTupleHeader. heap_page_prepare_for_xid() is used to ensure that given
> xid fits particular page base. If it doesn't fit then base of page is
> shifted, that could require single-page freeze. Format for wal is changed
> in order to prevent unaligned access to TransactionId. *_age GUCs and
> relation options are changed to 64-bit. Forced "autovacuum to prevent
> wraparound" is removed, but there is still freeze to truncate SLRUs.
>

It seems there is no README or some detailed explanation of how all
this works like how the value of pd_xid_base is maintained. I don't
think there are enough comments in the patch to explain the things. I
think it will be easier to understand and review the patch if you
provide some more details either in email or in the patch.

> 0004-base-values-for-testing-1.patch
> This patch is used for testing that calculations using 64-bit bases and
> short 32-bit xid values are correct. It provides initdb options for initial
> xid, multixact id and multixact offset values. Regression tests initialize
> cluster with large (more than 2^32) values.
>
> There are a lot of open items, but I would like to notice some of them:
> * WAL becomes significantly larger due to storage 8 byte xids instead of 4
> byte xids. Probably, its needed to use base approach in WAL too.
>

Yeah and I think it can impact performance as well. By any chance
have you run pgbench read-write to see the performance impact of this
patch?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-11-23 14:08:27 Re: PostgreSLQ v10.1 and xlC compiler on AIX
Previous Message Michael Paquier 2017-11-23 13:08:29 Re: PostgreSLQ v10.1 and xlC compiler on AIX