Suppress generating WAL records during the upgrade

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Suppress generating WAL records during the upgrade
Date: 2023-08-08 10:43:06
Message-ID: TYAPR01MB58660273EACEFC5BF256B133F50DA@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear hackers,
(CC: Julien, Sawada-san, Amit)

This thread is forked from "[PoC] pg_upgrade: allow to upgrade publisher node" [1].

# Background

[1] is a patch that allows logical replication slots to be copied from the old node to the new one.
The rough steps are as follows:

1. Start the old node in binary-upgrade mode
2. Check confirmed_flush_lsn of all the slots, and confirm that all WAL has been replicated downstream
3. Dump slot info to an SQL file
4. Stop the old node
5. Start the new node in binary-upgrade mode
...

Step 2 was introduced to avoid data loss. If there are WAL records ahead of
confirmed_flush_lsn, those records would never be replicated, which may be dangerous.

So in the current patch, pg_upgrade fails if any record other than SHUTDOWN_CHECKPOINT
exists after any slot's confirmed_flush_lsn.
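
To make the check concrete, here is a minimal, self-contained sketch of the rule.
It is not the code from the patch: WalRec, RecKind and slot_is_caught_up() are
hypothetical names, and treating "after" as "at or beyond confirmed_flush_lsn" is
an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a WAL record: a position and a kind. */
typedef enum
{
    REC_SHUTDOWN_CHECKPOINT,
    REC_OTHER                   /* anything else, e.g. RUNNING_XACT */
} RecKind;

typedef struct
{
    uint64_t    lsn;            /* start position of the record */
    RecKind     kind;
} WalRec;

/*
 * Return true if no record other than a shutdown checkpoint sits at or
 * beyond confirmed_flush_lsn (assumed interpretation of "after").
 */
bool
slot_is_caught_up(const WalRec *recs, size_t nrecs, uint64_t confirmed_flush_lsn)
{
    for (size_t i = 0; i < nrecs; i++)
    {
        if (recs[i].lsn >= confirmed_flush_lsn &&
            recs[i].kind != REC_SHUTDOWN_CHECKPOINT)
            return false;       /* an unreplicated record would be lost */
    }
    return true;
}
```

Under this model, a single stray record written after the slot's position is enough
to make the upgrade fail, which is why the records listed in "# Problem" matter.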

# Problem

We found that the following three records might be generated during the upgrade.

* RUNNING_XACT
* CHECKPOINT_ONLINE
* XLOG_FPI_FOR_HINT

RUNNING_XACT might be written by the background writer. It is generated when:

a. 15 seconds have elapsed since the last WAL creation or the startup of the process, and either of the following holds:
b-1. The process has never created a RUNNING_XACT record, or
b-2. Some "important WALs" were created after the last RUNNING_XACT record

CHECKPOINT_ONLINE might be written by the checkpointer. It is generated when:

a. checkpoint_timeout seconds have elapsed since the last creation or the startup of the process, and either of the following holds:
b-1. The process has never created a CHECKPOINT_ONLINE record, or
b-2. Some "important WALs" were created after the last CHECKPOINT record
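
Both RUNNING_XACT and CHECKPOINT_ONLINE follow the same "a and (b-1 or b-2)" shape.
As a rough sketch (periodic_record_due() and its parameters are hypothetical names,
not PostgreSQL functions), the condition can be written as one predicate:

```c
#include <stdbool.h>

/*
 * Model of the conditions listed above.
 *   elapsed_secs             time since the last record or process startup
 *   interval_secs            15 for RUNNING_XACT, checkpoint_timeout for
 *                            CHECKPOINT_ONLINE
 *   never_written            condition b-1
 *   important_wal_since_last condition b-2
 */
bool
periodic_record_due(double elapsed_secs, double interval_secs,
                    bool never_written, bool important_wal_since_last)
{
    bool        a = elapsed_secs >= interval_secs;

    return a && (never_written || important_wal_since_last);
}
```

Note that b-1 alone makes the record due on an idle server once the interval has
passed, which is exactly the case that arises during pg_upgrade.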

XLOG_FPI_FOR_HINT, which was pointed out by Sawada-san, might be generated by backend processes.
It is generated when:

a. Backend processes scanned any tuples (even in the system catalogs), and either of the following holds:
b-1. Data checksums are enabled, or
b-2. wal_log_hints is set to on

# Solution

I want to suppress WAL generation during the upgrade, for the reason described in "# Background".

Regarding RUNNING_XACT and CHECKPOINT_ONLINE, removing condition b-1 might be enough.
The interval between startup and the first {RUNNING_XACT|CHECKPOINT_ONLINE} record
becomes longer, but I could not find any negative impact from that.
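
The effect of dropping b-1 can be seen with a small simulation of an idle process.
This is only a sketch under the assumption that no "important WAL" is ever produced
during the run; count_idle_records() is a hypothetical helper, not part of the patch.

```c
#include <stdbool.h>

/*
 * Count how many periodic records an idle process writes over total_secs
 * seconds. keep_b1 = true models the current rule, keep_b1 = false models
 * the rule with condition b-1 removed.
 */
int
count_idle_records(int total_secs, int interval_secs, bool keep_b1)
{
    int         written = 0;
    int         last_activity = 0;      /* startup, or time of last record */
    bool        never_written = true;
    bool        important_wal_pending = false;  /* idle: never becomes true */

    for (int now = 1; now <= total_secs; now++)
    {
        bool        timed_out = (now - last_activity) >= interval_secs;
        bool        b1 = keep_b1 && never_written;
        bool        b2 = important_wal_pending;

        if (timed_out && (b1 || b2))
        {
            written++;
            never_written = false;
            last_activity = now;
        }
    }
    return written;
}
```

With the current rule, one record appears once the interval has passed; without b-1,
the idle process writes nothing until some important WAL shows up.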

As for XLOG_FPI_FOR_HINT, the simplest way I came up with is not to call
XLogSaveBufferForHint() during binary upgrade. I may not have considered everything,
but I have attached a patch for the fix. It passed CI on my repository.
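
The idea, reduced to a standalone model, looks like the following. This is not the
attached patch: HintLogSettings and hint_fpi_would_be_logged() are hypothetical names,
and the sketch ignores any further conditions the real code checks before logging a
full-page image.

```c
#include <stdbool.h>

/* Hypothetical bundle of the settings that matter for this decision. */
typedef struct
{
    bool        is_binary_upgrade;  /* server started in binary-upgrade mode */
    bool        data_checksums;     /* condition b-1 */
    bool        wal_log_hints;      /* condition b-2 */
} HintLogSettings;

/*
 * Decide whether setting a hint bit would lead to an XLOG_FPI_FOR_HINT
 * record. The proposed guard simply skips the logging step while in
 * binary-upgrade mode.
 */
bool
hint_fpi_would_be_logged(const HintLogSettings *s)
{
    if (s->is_binary_upgrade)
        return false;           /* proposed: never log during the upgrade */

    return s->data_checksums || s->wal_log_hints;
}
```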

Do you have any other thoughts on this?
An approach that adds "if (IsBinaryUpgrade)" to XLogInsertAllowed() was proposed in [2].
But I'm not sure it would really solve the issue - e.g., XLogInsertRecord() just
raises an ERROR if !XLogInsertAllowed().

[1]: https://commitfest.postgresql.org/44/4273/
[2]: https://www.postgresql.org/message-id/flat/20210121152357.s6eflhqyh4g5e6dv%40dalibo.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
0001-Suppress-generating-WAL-records-during-the-upgrade.patch application/octet-stream 2.6 KB