Re: Excessive number of replication slots for 12->14 logical replication

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Hubert Lubaczewski <depesz(at)depesz(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Excessive number of replication slots for 12->14 logical replication
Date: 2022-07-24 08:16:42
Message-ID: CAFPTHDbppsYxW3Ttg0fpYrJ8F8Zx1HVFr3Sn56XLK54C2GUYCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sun, Jul 24, 2022 at 6:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Jul 18, 2022 at 3:13 PM hubert depesz lubaczewski
> <depesz(at)depesz(dot)com> wrote:
> >
> > On Mon, Jul 18, 2022 at 09:07:35AM +0530, Amit Kapila wrote:
> >
> > First error:
> > #v+
> > 2022-07-18 09:22:07.046 UTC,,,4145917,,62d5263f.3f42fd,2,,2022-07-18 09:22:07 UTC,28/21641,1219146,ERROR,53400,"could not find free replication state slot for replication origin with OID 51",,"Increase max_replication_slots and try again.",,,,,,,"","logical replication worker",,0
> > #v-
> >
> > Nothing else errored out before, no warning, no fatals.
> >
> > from the first ERROR I was getting them in the range of 40-70 per minute.
> >
> > At the same time I was logging data from `select now(), * from pg_replication_slots`, every 2 seconds.
> >
> ...
> >
> > So, it looks that there are up to 10 focal slots, all active, and then there are sync slots with weirdly high counts for inactive ones.
> >
> > At most, I had 11 active sync slots.
> >
> > Looks like some kind of timing issue, which would be inline with what
> > Kyotaro Horiguchi wrote initially.
> >
>
> I think this is a timing issue similar to what Horiguchi-San has
> pointed out but due to replication origins. We drop the replication
> origin after the sync worker that has used it is finished. This is
> done by the apply worker because we don't allow to drop the origin
> till the process owning the origin is alive. I am not sure of
> repercussions but maybe we can allow dropping the origin by the
> process that owns it.
>

I have written a patch which will do the dropping of replication
origins in the sync worker itself.
I had to reset the origin session (which also resets the owned by
flag) prior to the dropping of the slots.

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
v1-0001-fix-issue-running-out-of-replicating-origin-slots.patch application/octet-stream 2.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2022-07-25 01:04:06 BUG #17558: 15beta2: Endless loop with UNIQUE NULLS NOT DISTINCT and INSERT ... ON CONFLICT
Previous Message Marco Boeringa 2022-07-24 07:55:29 Re: Fwd: "SELECT COUNT(*) FROM" still causing issues (deadlock) in PostgreSQL 14.3/4?