| From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
|---|---|
| To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
| Subject: | Re: Implement waiting for wal lsn replay: reloaded |
| Date: | 2026-01-12 06:53:46 |
| Message-ID: | CABPTF7U+SUnJX_woQYGe==R9Oz+-V6X0VO2stBLPGfJmH_LEhw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Alexander,
On Sat, Jan 10, 2026 at 12:47 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
> On Fri, Jan 9, 2026 at 9:44 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On Fri, Jan 9, 2026 at 4:42 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Jan 8, 2026 at 6:29 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > > > On Thu, Jan 8, 2026 at 10:19 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > > > > I see, you were right. This is not related to the MyProc->xmin.
> > > > > ResolveRecoveryConflictWithTablespace() calls
> > > > > GetConflictingVirtualXIDs(InvalidTransactionId, InvalidOid). That
> > > > > would kill WAIT FOR LSN query independently on its xmin.
> > > >
> > > > I think the concern is valid --- conflicts like
> > > > PROCSIG_RECOVERY_CONFLICT_SNAPSHOT could occur and terminate the
> > > > backend if the timing is unlucky. It's more difficult to reproduce
> > > > though. A check for the log containing "conflict with recovery" would
> > > > likely catch these conflicts as well.
> > >
> > > Yes, I found multiple reasons why xmin gets temporarily set during
> > > processing of WAIT FOR LSN query. I'll soon post a draft patch to fix
> > > that.
> > >
> > > > > I guess your
> > > > > patch is the only way to go. It's clumsy to wrap WAIT FOR LSN call
> > > > > with retry loop, but it would still consume less resources than
> > > > > polling.
> > > > >
> > > >
> > > > Assuming recovery conflicts are relatively rare in tap tests, except
> > > > for the explicitly designed tests like 031_recovery_conflict and the
> > > > narrow timing window that the standby has not caught up while the wait
> > > > for gets invoked, a simple fallback seems appropriate to me.
> > >
> > > Yes, I see. Seems acceptable given this seems the only feasible way to go.
> > >
> >
> > Here is the updated patch with recovery conflicts handled.
>
> V2 corrected the commit message to state " if the WAIT FOR LSN session
> is interrupted by a recovery conflict (e.g., DROP TABLESPACE
> triggering conflicts on all backends),". In this case, the statement
> is canceled when possible; in some states (idle in transaction or
> subtransaction) the session may be terminated.
>
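As a rough illustration of the retry-with-fallback idea discussed in the quoted messages, here is a minimal sketch in Python rather than the Perl used by the TAP tests. It is not the code in the patch. The names `wait_for_replay`, `run_wait_for_lsn`, `get_replay_lsn`, and `RecoveryConflict` are hypothetical stand-ins, LSNs are modeled as plain integers, and the retry count and timeout are arbitrary assumptions.

```python
import time


class RecoveryConflict(Exception):
    """Hypothetical error: the waiting session hit a "conflict with recovery"."""


def wait_for_replay(target_lsn, run_wait_for_lsn, get_replay_lsn,
                    max_attempts=3, poll_interval=0.1, timeout=5.0):
    """Return True once target_lsn has been replayed, False on timeout.

    run_wait_for_lsn(lsn) is assumed to block until the standby has
    replayed lsn and to raise RecoveryConflict if the statement is
    canceled. get_replay_lsn() is assumed to return the current replay
    position. Both are placeholders supplied by the caller.
    """
    deadline = time.monotonic() + timeout

    # Preferred path: let the server do the waiting, which is cheaper
    # than polling. Retry if a recovery conflict interrupts the wait.
    for _ in range(max_attempts):
        try:
            run_wait_for_lsn(target_lsn)
            return True
        except RecoveryConflict:
            continue

    # Fallback path: poll the replay position until the deadline.
    while time.monotonic() < deadline:
        if get_replay_lsn() >= target_lsn:
            return True
        time.sleep(poll_interval)
    return False
```

The point of the ordering is that the blocking wait is tried first and polling is only used when conflicts keep interrupting it, so the common case stays cheap.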
The attached patch avoids a syscache lookup while constructing the
tuple descriptor for WAIT FOR LSN, so that a catalog snapshot is not
re-established after the wait finishes.
The standard output path (printtup) may still briefly establish a
catalog snapshot during result emission, but this seems acceptable:
the snapshot window is narrow, lasting only as long as it takes to
emit a single row. A fully catalog-free output path would require
either bypassing the DestReceiver lifecycle (breaking layering) or
adding a custom receiver (added complexity for marginal benefit). The
current approach is simpler and is likely sufficient unless
output-phase conflicts turn out to be frequent in practice. Does this
make sense to you?
--
Best,
Xuneng
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Avoid-syscache-lookup-in-WAIT-FOR-LSN-tuple-descr.patch | application/octet-stream | 2.4 KB |