RE: [PoC] pg_upgrade: allow to upgrade publisher node

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-10-06 13:00:13
Message-ID: TYAPR01MB58666A2161F173EC87D956C0F5C9A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear hackers,

I revised the patch based on the comments. Please see the attached file.

>
> > Today, I discussed this problem with Andres at PGConf NYC and he
> > suggested as following. To verify, if there is any pending unexpected
> > WAL after shutdown, we can have an API like
> > pg_logical_replication_slot_advance() which will simply process
> > records without actually sending anything downstream. In this new API,
> > we will start with each slot's restart_lsn location and try to process
> > till the end of WAL, if we encounter any WAL that needs to be
> > processed (like we need to send the decoded WAL downstream) we can
> > return a false indicating that there is an unexpected WAL. The reason
> > to start with restart_lsn is that it is the location that we use to
> > start scanning the WAL anyway.

I implemented this using a decoding context. The binary upgrade function
processes WAL starting from the slot's confirmed_flush and returns false if
any meaningful changes are found.

Internally, I added a new decoding mode - DECODING_MODE_SILENT - and used it.
When the decoding context is in this mode, the output plugin is not loaded, but
all WAL records are decoded without skipping. I also added a new flag,
"did_process". It is set whenever a wrapper for an output plugin callback is
called in silent mode. The upgrade function checks both the reorder buffer and
the new flag because both transactional and non-transactional changes must be
detected. If we checked only the reorder buffer, we would miss the
non-transactional ones.
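
To illustrate the two-part check, here is a minimal standalone sketch. It is
not the patch code: every Toy* type, toy_* function, and the record
classification are hypothetical names invented for this example, and only the
idea (silent mode sets a flag in the callback wrapper, and the caller checks
both the flag and the buffered changes) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the decoding modes described above. */
typedef enum { TOY_MODE_NORMAL, TOY_MODE_FAST_FORWARD, TOY_MODE_SILENT } ToyDecodingMode;

/* Hypothetical record kinds; real WAL has many more. */
typedef enum {
    TOY_REC_CHECKPOINT,      /* nothing to send downstream */
    TOY_REC_RUNNING_XACTS,   /* nothing to send downstream */
    TOY_REC_HEAP_INSERT,     /* transactional change: queued in the reorder buffer */
    TOY_REC_LOGICAL_MESSAGE  /* non-transactional: goes straight to a callback */
} ToyRecordKind;

typedef struct {
    ToyDecodingMode mode;
    bool did_process;      /* set when a callback wrapper runs in silent mode */
    int  buffered_changes; /* stand-in for "the reorder buffer is not empty" */
} ToyDecodingContext;

/* Wrapper around the (absent) output plugin's message callback. */
static void toy_message_cb_wrapper(ToyDecodingContext *ctx)
{
    if (ctx->mode == TOY_MODE_SILENT) {
        ctx->did_process = true; /* remember that output would have been produced */
        return;                  /* no plugin is loaded, so nothing is called */
    }
    /* In normal mode the real output plugin callback would be invoked here. */
}

static void toy_decode_record(ToyDecodingContext *ctx, ToyRecordKind kind)
{
    if (ctx->mode == TOY_MODE_FAST_FORWARD)
        return; /* fast-forward skips changes; silent mode must not */

    switch (kind) {
    case TOY_REC_HEAP_INSERT:
        ctx->buffered_changes++;
        break;
    case TOY_REC_LOGICAL_MESSAGE:
        toy_message_cb_wrapper(ctx);
        break;
    default:
        break; /* checkpoint, running-xacts: nothing to decode */
    }
}

/* Returns false if any meaningful change is found after the start point. */
static bool toy_slot_is_caught_up(const ToyRecordKind *recs, size_t n)
{
    ToyDecodingContext ctx = { TOY_MODE_SILENT, false, 0 };

    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        toy_decode_record(&ctx, recs[i]);

    /* Both checks are needed: the flag alone misses buffered transactional
     * changes, and the buffer alone misses non-transactional messages. */
    return !ctx.did_process && ctx.buffered_changes == 0;
}
```

In this sketch, a stream that contains only a logical message leaves
buffered_changes at zero, so it is caught only through did_process.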

fast_forward is now a variant of the decoding mode.

Currently the function is called for every valid slot. If the approach seems
good, we can refactor as Bharath suggested [1].

>
> > Then, we should also try to create slots before invoking pg_resetwal.
> > The idea is that we can write a new binary mode function that will do
> > exactly what pg_resetwal does to compute the next segment and use that
> > location as a new location (restart_lsn) to create the slots in a new
> > node. Then, pass it pg_resetwal by using the existing option '-l
> > walfile'. As we don't have any API that takes restart_lsn as input, we
> > can write a new API probably for binary mode to create slots that do
> > take restart_lsn as input. This will ensure that there is no new WAL
> > inserted by background processes between resetwal and the creation of
> > slots.

Based on that, I added another binary upgrade function,
binary_upgrade_create_logical_replication_slot(). It is similar to
pg_create_logical_replication_slot(), but restart_lsn and confirmed_flush are
set to the start of the *next* WAL segment. The function returns the name of
that WAL file, which is then passed to the pg_resetwal command.
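
As a rough illustration of the "next segment" arithmetic (again, not the patch
code), the sketch below derives the next segment number, its starting LSN, and
a WAL file name from an LSN. The toy_* names are hypothetical, and the
calculation is simplified: it assumes the next segment is just the current one
plus one, whereas pg_resetwal also consults the control file and the existing
WAL files. The 24-hex-digit file name layout follows the usual WAL naming
(timeline, then the segment number split into two 8-digit parts).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t ToyLsn; /* hypothetical stand-in for XLogRecPtr */

/* Segment that follows the one containing lsn (simplified assumption). */
static uint64_t toy_next_segment_no(ToyLsn lsn, uint64_t seg_size)
{
    assert(seg_size > 0);
    return lsn / seg_size + 1;
}

/* First byte position of a segment; this is what the new slot's
 * restart_lsn and confirmed_flush would point to in this sketch. */
static ToyLsn toy_segment_start(uint64_t segno, uint64_t seg_size)
{
    return segno * seg_size;
}

/* Format a WAL file name: 8 hex digits each for the timeline, the high
 * part, and the low part of the segment number. */
static void toy_wal_file_name(char *buf, size_t len, uint32_t tli,
                              uint64_t segno, uint64_t seg_size)
{
    uint64_t segs_per_id = UINT64_C(0x100000000) / seg_size;

    snprintf(buf, len, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / segs_per_id),
             (uint32_t) (segno % segs_per_id));
}

/* Sanity check: exactly 24 upper-case hex digits. */
static bool toy_is_wal_file_name(const char *name)
{
    return strlen(name) == 24 &&
           strspn(name, "0123456789ABCDEF") == 24;
}
```

For example, with 16 MB segments and an LSN inside segment 3 on timeline 1,
the next segment is 4, and its file name is what would be handed to
pg_resetwal with the '-l' option.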

One consideration is that pg_log_standby_snapshot() must be executed before
the slots start consuming changes. The new cluster has no RUNNING_XACTS
records, so a decoding context on the new cluster cannot build a consistent
snapshot as-is. This could cause changes to be discarded the next time the
slots are consumed. To prevent that, the function is called after the final
pg_resetwal.

What do you think?

Acknowledgment: I would like to thank Hou for discussing this with me.

[1]: https://www.postgresql.org/message-id/CALj2ACWAdYxgzOpXrP%3DJMiOaWtAT2VjPiKw7ryGbipkSkocJ%3Dg%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v46-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch application/octet-stream 69.2 KB
