Re: Add 64-bit XIDs into PostgreSQL 15

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Ilya Anfimov <ilan(at)tzirechnoy(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add 64-bit XIDs into PostgreSQL 15
Date: 2022-02-02 15:10:23
Message-ID: CALT9ZEEsj54k9+xmcSAYLe7YsEHYbzFUuDwpVzkjmR6CCPHpww@mail.gmail.com
Lists: pgsql-hackers

Hi, Andres!

I've revised the README a little to address your corrections and
questions. Thank you very much for them!
A patchset with the revised README is attached as v8 (the code is
unchanged and identical to v7).

> > +The downside of this is that we can not use tuple's XMIN and XMAX right away.
> > +We often need to re-read t_xmin and t_xmax - which could actually be pointers
> > +into a page in shared buffers and therefore they could be updated by any other
> > +backend.
>
> Ugh, that's not great.
>
Agreed. This part is one of the candidates for revision, as per the
proposals above [1], i.e.:
"2A. Probably refactor it to store precalculated XMIN/XMAX in memory
tuple representation instead of t_xid_base/t_multi_base".

We are working on this change.
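
A minimal sketch of what proposal 2A could look like, under stated
assumptions: the 64-bit value is formed by adding a per-page base to the
32-bit on-tuple field, and values below 3 (the special XIDs) are passed
through without the base. All type and function names here are
hypothetical and are not taken from the patch. The point is only that the
in-memory copy holds full 64-bit values computed once, so later changes to
the page in shared buffers do not affect it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these come from the patch. */
typedef uint64_t Xid64;
typedef uint32_t ShortXid;

/* Assumption: XIDs below this are special values and get no base added. */
#define FIRST_NORMAL_XID 3

/* 32-bit fields as they would sit in a tuple on a shared-buffer page. */
typedef struct OnPageTuple
{
    ShortXid    t_xmin;
    ShortXid    t_xmax;
} OnPageTuple;

/* In-memory tuple carrying precalculated 64-bit XMIN/XMAX. */
typedef struct MemTuple
{
    Xid64       xmin;
    Xid64       xmax;
} MemTuple;

/* Expand a 32-bit on-page XID using the page's base. */
static Xid64
short_to_xid64(Xid64 base, ShortXid xid)
{
    return xid >= FIRST_NORMAL_XID ? base + xid : (Xid64) xid;
}

/* Compute both values once, while the caller still holds the page. */
static MemTuple
precalc_tuple(Xid64 xid_base, const OnPageTuple *src)
{
    MemTuple    t;

    assert(src != NULL);
    t.xmin = short_to_xid64(xid_base, src->t_xmin);
    t.xmax = short_to_xid64(xid_base, src->t_xmax);
    return t;
}
```

With such a copy, visibility checks would read `xmin`/`xmax` from the
`MemTuple` and would not need to go back to the page.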

> What happens if the first access happens on a replica?
>
> What is the approach for dealing with multixact files? They have xids
> embedded? And currently the SLRUs will break if you just let the offsets SLRU
> grow without bounds.
>
> Wait. So you just modify the page without WAL logging or marking it dirty on a
> standby? I fail to see how that can be correct.
>
> Imagine the cluster is promoted, the page is dirtied, and we write it
> out. You'll have written out a completely changed page, without any WAL
> logging. There's plenty other scenarios.
>
In this part, I suppose you've found a definite bug. Thanks! There are a
couple of ways it could be fixed:

1. If we enforce a checkpoint at replica promotion, then a full-page
write is forced for every page modified afterward.

2. Maybe it's worth using a BufferDesc bit to mark the page as converted to
64xid but not yet written to disk? For example, one of the four bits from
BUF_USAGECOUNT. BM_MAX_USAGE_COUNT = 5, so 3 bits are enough to store it.
This will change the in-memory page representation but will not need
WAL-logging, which is impossible on a replica.
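
To make the bit budget in option 2 concrete, here is a rough sketch. It
assumes a 32-bit buffer state word with a 4-bit usage-count field starting
at bit 18, modeled on my reading of buf_internals.h; the flag name
XID64_CONVERTED and all helper functions are hypothetical. It only shows
that a counter capped at 5 fits in the low 3 bits of that field, leaving
the top bit free for a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: a 4-bit usage-count field starting at bit 18. */
#define USAGECOUNT_SHIFT    18
#define MAX_USAGE_COUNT     5

/* Usage count confined to the low 3 bits of the 4-bit field. */
#define USAGECOUNT_MASK3    (0x7U << USAGECOUNT_SHIFT)

/* Hypothetical flag in the top bit of the field: page converted to
 * 64xid in memory, but not yet written to disk. */
#define XID64_CONVERTED     (1U << (USAGECOUNT_SHIFT + 3))

/* The cap must fit in 3 bits for the top bit to be free. */
_Static_assert(MAX_USAGE_COUNT <= 7, "usage count must fit in 3 bits");

static uint32_t
get_usage_count(uint32_t state)
{
    return (state & USAGECOUNT_MASK3) >> USAGECOUNT_SHIFT;
}

/* Increment the usage count, saturating at MAX_USAGE_COUNT. */
static uint32_t
bump_usage_count(uint32_t state)
{
    if (get_usage_count(state) < MAX_USAGE_COUNT)
        state += 1U << USAGECOUNT_SHIFT;
    return state;
}

static uint32_t
mark_converted(uint32_t state)
{
    return state | XID64_CONVERTED;
}

/* Assumption: the flag would be cleared once the page is written out. */
static uint32_t
clear_converted(uint32_t state)
{
    return state & ~XID64_CONVERTED;
}

static int
is_converted(uint32_t state)
{
    return (state & XID64_CONVERTED) != 0;
}
```

Because the counter saturates at 5, it never carries into the flag bit, so
the two can share the field without disturbing each other.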

What do you think about it?

[1]
https://www.postgresql.org/message-id/CALT9ZEHy9yFQEwptCUznPLciqM9ZSs91yTnNSSiG22m%3DBgCpNA%40mail.gmail.com

Attachment Content-Type Size
v8-0003-README.XID64.patch application/octet-stream 7.0 KB
v8-0001-64-bit-GUCs.patch application/octet-stream 25.6 KB
v8-0002-Add-64bit-xid.patch application/octet-stream 735.9 KB
