From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Implement waiting for wal lsn replay: reloaded |
Date: | 2025-08-09 11:27:25 |
Message-ID: | CABPTF7VsoGDMBq34MpLrMSZyxNZvVbgH6-zxtJOg5AwOoYURbw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
> On Thu, Aug 7, 2025 at 6:01 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> > Thanks for working on this.
> >
> > I’ve just come across this thread and haven’t had a chance to dig into
> > the patch yet, but I’m keen to review it soon.
>
> Great. Thank you for your attention to this patch. I appreciate your
> intention to review it.
I did a quick pass over v7. There are a few thoughts to share—mostly
around documentation, build, and tests, plus some minor nits. The core
logic looks solid to me. I’ll take a deeper look as I work on a
follow‑up patch to add waiting for flush LSNs. And the patch seems to
need rebase; it can't be applied to HEAD cleanly for now.
Build
1) Consider adding a comma in `src/test/recovery/meson.build` after
`'t/048_vacuum_horizon_floor.pl'` so the list remains valid.
Core code
2) It may be safer for `WaitLSNWakeup()` to assert against the stack array size:
) Perhaps `Assert(numWakeUpProcs < WAKEUP_PROC_STATIC_ARRAY_SIZE);`
rather than `MaxBackends`.
For option parsing UX in `wait.c`, we might prefer:
3) Using `ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR),
errmsg(...)))` instead of `elog(ERROR, ...)` for consistency and
translatability.
4) Explicitly rejecting duplicate `LSN`/`TIMEOUT` options with a syntax error.
5) The result column label could align better with other utility
outputs if shortened to `status` (lowercase, no space).
6) After `parse_real()`, it could help to validate/clamp the timeout
to avoid overflow when converting to `int64` and when passing a `long`
to `WaitLatch()`.
7) If `nodes/print.h` in `src/backend/commands/wait.c` isn’t used, we
might drop the include.
8) A couple of comment nits: “do it this outside” → “do this outside”.
Tests
9) We might consider adding cases for:
- Negative `TIMEOUT` (to exercise the error path).
- Syntax errors (unknown option; duplicate `LSN`/`TIMEOUT`; missing `LSN`).
Documentation
`doc/src/sgml/ref/wait_for.sgml`
10) The index term could be updated to `<primary>WAIT FOR</primary>`.
11) The synopsis might read more clearly as:
- WAIT FOR LSN '<lsn>' [ TIMEOUT <milliseconds |
'duration-with-units'> ] [ NO_THROW ]
12) The purpose line might be smoother as “wait for a target LSN to be
replayed, optionally with a timeout”.
13) Return values might use `<literal>` for `success`, `timeout`, `not
in recovery`.
14) Consistently calling this a “command” (rather than
function/procedure) could reduce confusion.
15) The example text might read more cleanly as “If the target LSN is
not reached before the timeout …”.
`doc/src/sgml/high-availability.sgml`
16) The sentence could read “However, it is possible to address this
without switching to synchronous replication.”
`src/backend/utils/activity/wait_event_names.txt`
17) The description for `WAIT_FOR_WAL_REPLAY` might be clearer as
“Waiting for WAL replay to reach a target LSN on a standby.”
Best,
Xuneng
From | Date | Subject | |
---|---|---|---|
Next Message | Joe Conway | 2025-08-09 11:45:16 | Re: Making type Datum be 8 bytes everywhere |
Previous Message | Xuneng Zhou | 2025-08-09 10:52:37 | Re: Implement waiting for wal lsn replay: reloaded |