Logical replication halted due to "this slot has been invalidated because it exceeded the maximum reserved size."

From: Viljo Hakala <Viljo(dot)Hakala(at)advania(dot)com>
To: "pgsql-admin(at)lists(dot)postgresq(dot)org" <pgsql-admin(at)lists(dot)postgresq(dot)org>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Logical replication halted due to "this slot has been invalidated because it exceeded the maximum reserved size."
Date: 2022-11-29 08:47:13
Message-ID: AS8PR05MB87416B4AED195E15F34955CB97129@AS8PR05MB8741.eurprd05.prod.outlook.com
Lists: pgsql-admin pgsql-bugs

Hello lists,

We have two PostgreSQL 14.3 servers on Red Hat Linux 8.5, running two different databases on VMs, with logical replication between the databases.

Recently we have been bitten by a recurring issue. Each time, we have had to rebuild logical replication by dropping the subscription, since that also drops the logical replication slot on the primary. After that the issue does not occur for some time, but it eventually comes back.

This time it recurred 3 days after logical replication was rebuilt. The previous time it took 2-3 months to recur.

We started to get these messages on the standby:

2022-11-29 10:21:06.940 EET [1698404] ERROR: could not start WAL streaming: ERROR: cannot read from logical replication slot "logs"
DETAIL: This slot has been invalidated because it exceeded the maximum reserved size.
2022-11-29 10:21:06.942 EET [1698368] LOG: background worker "logical replication worker" (PID 1698404) exited with exit code 1

This happens even though max_slot_wal_keep_size is -1, which according to the manual should mean there is no limit:

If max_slot_wal_keep_size is -1 (the default), replication slots may retain an unlimited amount of WAL files

https://postgresqlco.nf/doc/en/param/max_slot_wal_keep_size/

psql (14.3)
Type "help" for help.

postgres=# show max_slot_wal_keep_size;
 max_slot_wal_keep_size
------------------------
 -1
(1 row)
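For comparison, the slot's state can also be read directly from the pg_replication_slots view on the publisher, which is where the slot lives. The following is only a minimal sketch, assuming it is run on the publisher and that the slot is named logs as in the log output above. The wal_status and safe_wal_size columns exist in PostgreSQL 13 and later, and the second query shows where the effective max_slot_wal_keep_size value comes from.

```sql
-- Run on the publisher (the server that owns the slot), not on the subscriber.
-- wal_status = 'lost' means the slot has been invalidated.
SELECT slot_name, slot_type, active, restart_lsn, wal_status, safe_wal_size
FROM pg_replication_slots
WHERE slot_name = 'logs';  -- slot name taken from the error message

-- Show the effective setting and where it was set (default, config file, ALTER SYSTEM, ...).
SELECT name, setting, unit, source, sourcefile, pending_restart
FROM pg_settings
WHERE name = 'max_slot_wal_keep_size';
```

If the second query returns something other than -1 on the publisher, the SHOW output above may have come from a different server than the one holding the slot.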

Why is this happening? Is this a bug in PG 14.3?

Our fix for the time being is:

DB=# alter subscription logs disable;
ALTER SUBSCRIPTION
SN4ReportingDB=# drop subscription logs;
NOTICE: dropped replication slot "logs" on publisher
DROP SUBSCRIPTION
SN4ReportingDB=# create subscription logs connection 'dbname=DBNAMAE host=192.168.1.1 port=5000 user=postgres' publication log_pub with (copy_data=false);
NOTICE: created replication slot "logs" on publisher
CREATE SUBSCRIPTION

But as this is a live system with a terabyte of data, some data will be lost unless we rebuild the whole replication from scratch, and that is not acceptable!
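One way to notice a slot falling behind before it is invalidated might be to track how much WAL each slot is holding back. This is a hedged sketch rather than a fix: it assumes it is run on the publisher (pg_current_wal_lsn() is not available on a server in recovery), and the retained_wal alias is only an illustrative name.

```sql
-- Run on the publisher: amount of WAL each replication slot is still holding back.
SELECT slot_name,
       active,
       wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE restart_lsn IS NOT NULL   -- skip slots with no reserved WAL position
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
```

A retained_wal value that keeps growing while the subscription is enabled would suggest that the subscriber is not confirming changes.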

Any advice?

Regards,

Viljo Hakala
