Re: [UNVERIFIED SENDER] Re: pg_upgrade can result in early wraparound on databases with high transaction load

From: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Jason Harvey <jason(at)reddit(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, <pgsql-hackers(at)postgresql(dot)org>, "Tharakan, Robins" <tharar(at)amazon(dot)com>
Subject: Re: [UNVERIFIED SENDER] Re: pg_upgrade can result in early wraparound on databases with high transaction load
Date: 2021-05-18 11:26:38
Message-ID: 78bf65ea-517a-51ed-947b-9728a6ace513@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi,

On 5/4/21 10:17 AM, Drouvot, Bertrand wrote:
>
> Hi,
>
> On 4/24/21 3:00 AM, Andres Freund wrote:
>> Hi,
>>
>> On 2021-04-23 19:28:27 -0500, Justin Pryzby wrote:
>>> This (combination of) thread(s) seems relevant.
>>>
>>> Subject: pg_upgrade failing for 200+ million Large Objects
>>> https://www.postgresql.org/message-id/flat/12601596dbbc4c01b86b4ac4d2bd4d48%40EX13D05UWC001.ant.amazon.com
>>> https://www.postgresql.org/message-id/flat/a9f9376f1c3343a6bb319dce294e20ac%40EX13D05UWC001.ant.amazon.com
>>> https://www.postgresql.org/message-id/flat/cc089cc3-fc43-9904-fdba-d830d8222145%40enterprisedb.com#3eec85391c6076a4913e96a86fece75e
>> Huh. Thanks for digging these up.
>>
>>
>>>> Allows the user to provide a constant via pg_upgrade command-line, that
>>>> overrides the 2 billion constant in pg_resetxlog [1] thereby increasing the
>>>> (window of) Transaction IDs available for pg_upgrade to complete.
>> That seems the entirely the wrong approach to me, buying further into
>> the broken idea of inventing random wrong values for oldestXid.
>>
>> We drive important things like the emergency xid limits off oldestXid. On
>> databases with tables that are older than ~147million xids (i.e. not even
>> affected by the default autovacuum_freeze_max_age) the current constant leads
>> to setting the oldestXid to a value *in the future*/wrapped around. Any
>> different different constant (or pg_upgrade parameter) will do that too in
>> other scenarios.
>>
>> As far as I can tell there is precisely *no* correct behaviour here other than
>> exactly copying the oldestXid limit from the source database.
>>
> Please find attached a patch proposal doing so: it adds a new (- u)
> parameter to pg_resetwal that allows to specify the oldest unfrozen
> XID to set.
> Then this new parameter is being used in pg_upgrade to copy the source
> Latest checkpoint's oldestXID.
>
> Questions:
>
> * Should we keep the old behavior in case -x is being used without
> -u? (The proposed patch does not set an arbitrary oldestXID
> anymore in case -x is used.)
> * Also shouldn't we ensure that the xid provided with -x or -u is >=
> FirstNormalTransactionId (Currently the only check is that it is # 0)?
>

Copy/pasting Andres feedback (Thanks Andres for this feedback) on those
questions from another thread [1].

> I was also wondering if:
>
> * We should keep the old behavior in case pg_resetwal -x is being used
> without -u?
 (The proposed patch does not set an arbitrary oldestXID
> anymore in 
case -x is used)

Andres: I don't think we should. I don't see anything in the old
behaviour worth
maintaining.

> * We should ensure that the xid provided with -x or -u is
> >=
FirstNormalTransactionId (Currently the only check is that it is
> # 0)?

Andres: Applying TransactionIdIsNormal() seems like a good idea.

=> I am attaching a new version that makes use of
TransactionIdIsNormal() checks.

Andres: I think it's important to verify that the xid provided with -x
is within a reasonable range of the oldest xid.

=> What do you mean by "a reasonable range"?

Thanks

Bertrand

[1]:
https://www.postgresql.org/message-id/20210517185646.pwe4klaufwmdhe2a%40alap3.anarazel.de

Attachment Content-Type Size
v1-0002-pg-upgrade-keep-oldestxid.patch text/plain 8.9 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2021-05-18 12:24:17 Re: Logical Replication: SELECT pg_catalog.set_config Statement
Previous Message Hannes Kühtreiber 2021-05-18 09:29:38 Logical Replication: SELECT pg_catalog.set_config Statement

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-05-18 11:31:47 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Drouvot, Bertrand 2021-05-18 11:04:18 Re: prion failed with ERROR: missing chunk number 0 for toast value 14334 in pg_toast_2619