[PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: 'Julien Rouhaud' <rjuju123(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-04-04 07:00:01
Message-ID: TYAPR01MB58664C81887B3AF2EB6B16E3F5939@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear hackers,
(CC: Amit and Julien)

This thread is forked from Julien's thread, which allows subscribers to be
upgraded without losing changes [1].

I have implemented a quick prototype that allows a publisher node to be upgraded.
IIUC the key missing piece was that replication slots used for logical replication
could not be copied to the new node by the pg_upgrade command, so this patch allows that.
The feature is enabled when '--include-replication-slot' is specified. I also
added a small test for the typical case, which may help in understanding it.

pg_upgrade internally executes pg_dump to dump database objects from the old node.
This feature follows the same approach and adds a new option, '--slot-only', to the
pg_dump command. When it is specified, pg_dump extracts the needed information from
the old node and generates an SQL file that executes pg_create_logical_replication_slot().
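
As an illustration of the kind of SQL file described above, here is a minimal
sketch in Python. It is not taken from the patch: the LogicalSlot record, the
helper names, and the exact shape of the emitted statements are assumptions.
Only pg_create_logical_replication_slot() itself is a real function, and the
four-argument form used here (slot_name, plugin, temporary, twophase) assumes
PostgreSQL 14 or later.

```python
from dataclasses import dataclass


@dataclass
class LogicalSlot:
    # Hypothetical record; the fields mirror columns of pg_replication_slots.
    slot_name: str
    plugin: str
    two_phase: bool = False


def quote_literal(value: str) -> str:
    """Quote a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def slot_creation_sql(slots):
    """Return one SELECT statement per logical slot to be recreated."""
    lines = []
    for s in slots:
        lines.append(
            "SELECT pg_catalog.pg_create_logical_replication_slot("
            f"{quote_literal(s.slot_name)}, {quote_literal(s.plugin)}, "
            # temporary is always false: upgraded slots must be persistent.
            f"false, {'true' if s.two_phase else 'false'});"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(slot_creation_sql([LogicalSlot("sub1", "pgoutput")]))
```

The statements are plain SELECT calls, so a file built this way could be fed to
psql against the new node like any other restore script.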

The notable difference from the existing behavior is that the slots are restored
at a different time. Currently pg_upgrade works in the following steps:

...
1. dump the schema from the old node
2. run pg_resetwal several times on the new node
3. restore the schema to the new node
4. run pg_resetwal again on the new node
...

The problem is that if we create replication slots at step 3, restart_lsn and
confirmed_flush_lsn are set to current_wal_insert_lsn at that time, whereas
pg_resetwal discards the WAL files. Such slots cannot extract changes.
To handle this issue, restoring is separated into two phases. In the first phase,
everything except replication slots is restored at step 3. In the second phase,
replication slots are restored at step 5, after pg_resetwal has been done.
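
The ordering issue can be shown with a toy model. The sketch below is only an
illustration under strong assumptions: WAL is reduced to an integer position,
the ToyCluster class and all numbers are made up, and real pg_resetwal
behavior is more involved. It only shows why a slot created before the last
pg_resetwal points at WAL that no longer exists.

```python
class ToyCluster:
    """Toy model of the new cluster: WAL is only an integer position."""

    def __init__(self):
        self.insert_lsn = 100   # current WAL insert position
        self.oldest_wal = 0     # oldest WAL position still on disk
        self.slots = {}         # slot name -> restart_lsn

    def restore_schema(self):
        # Restoring objects generates some WAL.
        self.insert_lsn += 50

    def create_slot(self, name):
        # A new slot starts at the current insert position.
        self.slots[name] = self.insert_lsn

    def pg_resetwal(self):
        # Existing WAL is discarded and WAL restarts at a new position.
        self.insert_lsn += 1000
        self.oldest_wal = self.insert_lsn

    def usable(self, name):
        # A slot can decode only if the WAL at its restart_lsn still exists.
        return self.slots[name] >= self.oldest_wal


def slots_with_schema():
    """Create the slot at step 3, together with the schema restore."""
    c = ToyCluster()
    c.pg_resetwal()        # step 2
    c.restore_schema()     # step 3
    c.create_slot("sub")   # slot created too early
    c.pg_resetwal()        # step 4 discards the WAL the slot points at
    return c.usable("sub")


def slots_after_resetwal():
    """Create the slot at step 5, after the last pg_resetwal."""
    c = ToyCluster()
    c.pg_resetwal()        # step 2
    c.restore_schema()     # step 3, without slots
    c.pg_resetwal()        # step 4
    c.create_slot("sub")   # step 5
    return c.usable("sub")


if __name__ == "__main__":
    print(slots_with_schema(), slots_after_resetwal())
```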

Before upgrading a publisher node, all the changes generated on the publisher must
be sent to and applied on the subscriber. This is because restart_lsn and
confirmed_flush_lsn of the copied replication slots are the same as
current_wal_insert_lsn: the new node resets the information about which WAL has
actually been applied on the subscriber and restarts from there.
Basically this is not a problem, because before the publisher shuts down, its
walsender processes confirm that all data has been replicated. See WalSndDone() and related code.
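
For illustration, a pre-upgrade check of that condition might compare each
slot's confirmed_flush_lsn with the publisher's current WAL position. The
sketch below is not part of the patch; the function names and the input shape
(a dict of slot name to LSN text, as one might read from pg_replication_slots)
are assumptions. It relies only on the LSN text format, two hexadecimal
numbers separated by a slash, where the first is the high 32 bits.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lagging_slots(slots, current_lsn: str):
    """Return the names of slots whose confirmed_flush_lsn is behind current_lsn.

    `slots` is a hypothetical mapping of slot name -> confirmed_flush_lsn text.
    """
    target = parse_lsn(current_lsn)
    return sorted(name for name, lsn in slots.items() if parse_lsn(lsn) < target)


if __name__ == "__main__":
    slots = {"sub1": "0/3000060", "sub2": "0/2FFFFFF"}
    # sub2 has not confirmed everything up to 0/3000060 yet.
    print(lagging_slots(slots, "0/3000060"))
```

On a running server the insert position keeps advancing, so a check like this
is only meaningful once write activity on the publisher has stopped.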

Currently physical slots are ignored because they are out of scope for me.
I have not done any analysis of them.

[1]: https://www.postgresql.org/message-id/flat/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
0001-pg_upgrade-Add-include-replication-slot-option.patch application/octet-stream 21.2 KB
