RE: Exit walsender before confirming remote flush in logical replication

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Subject: RE: Exit walsender before confirming remote flush in logical replication
Date: 2023-02-09 10:11:10
Message-ID: TYAPR01MB586683FC450662990E356A0EF5D99@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Osumi-san,

Thank you for reviewing! PSA new version.

> (1)
>
> + Decides the condition for exiting the walsender process.
> + <literal>'wait_flush'</literal>, which is the default, the walsender
> + will wait for all the sent WALs to be flushed on the subscriber side,
> + before exiting the process. <literal>'immediate'</literal> will exit
> + without confirming the remote flush. This may break the consistency
> + between publisher and subscriber, but it may be useful for a system
> + that has a high-latency network to reduce the amount of time for
> + shutdown.
>
> (1-1)
>
> The first part "exiting the walsender process" can be improved.
> Probably, you can say "the exiting walsender process" or
> "Decides the behavior of the walsender process at shutdown" instread.

Fixed. Second idea was chosen.

> (1-2)
>
> Also, the next sentence can be improved something like
> "If the shutdown mode is wait_flush, which is the default, the
> walsender waits for all the sent WALs to be flushed on the subscriber side.
> If it is immediate, the walsender exits without confirming the remote flush".

Fixed.

> (1-3)
>
> We don't need to wrap wait_flush and immediate by single quotes
> within the literal tag.

This style was ported from the SNAPSHOT options part, so I decided to keep.

> (2)
>
> + /* minapplydelay affects SHUTDOWN_MODE option */
>
> I think we can move this comment to just above the 'if' condition
> and combine it with the existing 'if' conditions comments.

Moved and added some comments.

> (3) 001_rep_changes.pl
>
> (3-1) Question
>
> In general, do we add this kind of check when we extend the protocol
> (STREAM_REPLICATION command)
> or add a new condition for apply worker exit ?
> In case when we would like to know the restart of the walsender process in TAP
> tests,
> then could you tell me why the new test code matches the purpose of this patch ?

The replication command is not for normal user, so I think we don't have to test itself.

The check that waits to restart the apply worker was added to improve the robustness.
I think there is a possibility to fail the test when the apply worker recevies a transaction
before it checks new subscription option. Now the failure can be avoided by
confriming to reload pg_subscription and restart.

> (3-2)
>
> + "Timed out while waiting for apply to restart after changing min_apply_delay
> to non-zero value";
>
> Probably, we can partly change this sentence like below, because we check
> walsender's pid.
> FROM: "... while waiting for apply to restart..."
> TO: "... while waiting for the walsender to restart..."

Right, fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v6-0001-Time-delayed-logical-replication-subscriber.patch application/octet-stream 75.8 KB
v6-0002-Extend-START_REPLICATION-command-to-accept-walsen.patch application/octet-stream 15.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-02-09 10:12:03 RE: Exit walsender before confirming remote flush in logical replication
Previous Message Aleksander Alekseev 2023-02-09 10:06:04 Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent