Rework LogicalOutputPluginWriterUpdateProgress (WAS Re: Logical replication timeout ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Rework LogicalOutputPluginWriterUpdateProgress (WAS Re: Logical replication timeout ...)
Date: 2023-02-09 05:51:41
Message-ID: CAA4eK1+DB66cYRRVyGcaMm7+tQ_u=q=+HWGjpu9X0pqMFWbsZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 9, 2023 at 1:33 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hacking on a rough prototype how I think this should rather look, I had a few
> questions / remarks:
>
> - We probably need to call UpdateProgress from a bunch of places in decode.c
> as well? Indicating that we're lagging by a lot, just because all
> transactions were in another database seems decidedly suboptimal.
>

We can do that but I think in all those cases we will reach quickly
enough back to walsender logic (WalSndLoop - that will send keepalive
if required) that we don't need to worry. After processing each
record, the logic will return back to the main loop that will send
keepalive if required. Also, while reading WAL if we need to block, it
will call WalSndWaitForWal() which will send keepalive if required.
The real problem we have seen in the field reports or tests is that
when we process a large transaction where changes are queued in the
reorderbuffer and while processing those we discard all or most of the
changes.

The patch calls update_progress in change_cb_wrapper and other
wrappers which will miss the case of DDLs that generates a lot of data
that is not processed by the plugin. I think for that we either need
to call update_progress from reorderbuffer.c similar to what the patch
has removed or we need some other way to address it. Do you have any
better idea?

> - Why should lag tracking only be updated at commit like points? That seems
> like it adds odd discontinuinities?
>

We have previously experimented to call it from non-commit locations
but that turned out to give inaccurate information about Lag. See
email [1].

> - The mix of skipped_xact and ctx->end_xact in WalSndUpdateProgress() seems
> somewhat odd. They have very overlapping meanings IMO.
>
> - there's no UpdateProgress calls in pgoutput_stream_abort(), but ISTM there
> should be? It's legit progress.
>

Agreed with both of the above points.

> - That's from 6912acc04f0: I find LagTrackerRead(), LagTrackerWrite() quite
> confusing, naming-wise. IIUC "reading" is about receiving confirmation
> messages, "writing" about the time the record was generated. ISTM that the
> current time is a quite poor approximation in XLogSendPhysical(), but pretty
> much meaningless in WalSndUpdateProgress()? Am I missing something?
>

Leaving it for Thomas to answer.

> - Aren't the wal_sender_timeout / 2 checks in WalSndUpdateProgress(),
> WalSndWriteData() missing wal_sender_timeout <= 0 checks?
>

It seems we are checking that via
ProcessPendingWrites()->WalSndKeepaliveIfNecessary(). Do you think we
need to check it before as well?

> - I don't really understand why f95d53edged55 added !end_xact to the if
> condition for ProcessPendingWrites(). Is the theory that we'll end up in an
> outer loop soon?
>

Yes. For non-empty xacts, we will anyway send a commit message. For
empty (skipped) xacts, we will send for synchronous replication case
to avoid any delay.

>
> Attached is a current, quite rough, prototype. It addresses some of the points
> raised, but far from all. There's also several XXXs/FIXMEs in it. I changed
> the file-ending to .txt to avoid hijacking the CF entry.
>

I have started a separate thread to avoid such confusion. I hope that
is fine with you.

> > > I don't think the syncrep logic in WalSndUpdateProgress really works as-is -
> > > consider what happens if e.g. the origin filter filters out entire
> > > transactions. We'll afaics never get to WalSndUpdateProgress(). In some cases
> > > we'll be lucky because we'll return quickly to XLogSendLogical(), but not
> > > reliably.
> >

Which case are you worried about? As mentioned in one of the previous
points I thought the timeout/keepalive handling in the callers should
be enough.

> > Is it actually the right thing to check SyncRepRequested() in that logic? It's
> > quite common to set up syncrep so that individual users or transactions opt
> > into syncrep, but to leave the default disabled.
> >
> > I don't really see an alternative to making this depend solely on
> > sync_standbys_defined.

Fair point.

How about renaming ProcessPendingWrites to WaitToSendPendingWrites or
WalSndWaitToSendPendingWrites?

[1] - https://www.postgresql.org/message-id/OS3PR01MB62755D216245199554DDC8DB9EEA9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-02-09 05:52:02 Re: pg_upgrade failures with large partition definitions on upgrades from ~13 to 14~
Previous Message Takamichi Osumi (Fujitsu) 2023-02-09 05:50:05 RE: Exit walsender before confirming remote flush in logical replication