| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Ajin Cherian <itsajin(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
| Date: | 2025-12-04 06:03:50 |
| Message-ID: | CAJpy0uB8zM5xTuCHLRqqyS9o2miq_R9p8xDGonCTJQ634h+KCQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, Dec 4, 2025 at 10:51 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> On Wed, Dec 3, 2025 at 10:19 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 3, 2025 at 8:51 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > >
> > > Attaching patch v28 addressing these comments.
> > >
> >
> > Can we extract the part of the patch that handles SIGUSR1 signal
> > separately as a first patch and the remaining as a second patch?
> > Please do mention the reason in the commit message as to why we are
> > changing the signal for SIGINT to SIGUSR1.
> >
>
> I have extracted out the SIGUSR1 signal handling changes separately
> into a patch and sharing. I will share the next patch later.
> Let me know if there are any comments for this patch.
>
I have just 2 trivial comments for v29-001:
1)
- * receives a SIGINT from the startup process, or when there is an error.
+ * receives a SIGUSR1 from the startup process, or when there is an error.
In above we should mention stopSignaled rather than SIGUSR1, as
SIGUSR1 is just a wakeup signal and not termination signal.
2)
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot continue replication slot synchronization"
+ " as standby promotion is triggered"));
Please mention that it is SQL-function in the comment for else-block.
~~
I tested the touched scenarios and here are the LOGs:
a)
When promotion is ongoing and the startup process has terminated
slot-sync worker but if the postmaster has not noticed that, it may
end up starting slotsync worker again. For that scenario, we get
these:
11:03:19.712 IST [151559] LOG: replication slot synchronization
worker is shutting down as promotion is triggered
11:03:19.726 IST [151629] LOG: slot sync worker started
11:03:19.795 IST [151629] LOG: replication slot synchronization
worker is shutting down as promotion is triggered
b)
On promotion, API gets this (originating from ProcessSlotSyncInterrupts now):
postgres=# SELECT pg_sync_replication_slots();
ERROR: cannot continue replication slot synchronization as standby
promotion is triggered
c)
If any parameter is changed between ValidateSlotSyncParams() and
ProcessSlotSyncInterrupts() for API, we get this:
postgres=# SELECT pg_sync_replication_slots();
ERROR: replication slot synchronization will stop because of a parameter change
--on re-run (originating from ValidateSlotSyncParams())
postgres=# SELECT pg_sync_replication_slots();
ERROR: replication slot synchronization requires
"hot_standby_feedback" to be enabled
~~
The tested scenarios' behaviour looks good to me.
thanks
Shveta
| From | Date | Subject | |
|---|---|---|---|
| Next Message | cca5507 | 2025-12-04 06:11:29 | Re: Support loser tree for k-way merge |
| Previous Message | Amit Kapila | 2025-12-04 05:58:03 | Re: Newly created replication slot may be invalidated by checkpoint |