Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-15 12:05:27
Message-ID: CALj2ACU=Q5JG1VQFWFQe0VjPkOu2kt-6gpq7z1Ed9+JX_BR-tQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 15, 2024 at 10:15 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> > > > wal_level_insufficient it's the reason for conflict with recovery".
>
> +1 on maintaining both conflicting and invalidation_reason

Thanks.

> Thanks for the patch. JFYI, patch09 does not apply to HEAD, some
> recent commit caused the conflict.

Yep, the conflict is in src/test/recovery/meson.build and is because
of e6927270cd18d535b77cbe79c55c6584351524be.

> Some trivial comments on patch001 (yet to review other patches)

Thanks for looking into this.

> 1)
> info.c:
>
> - "%s as caught_up, conflict_reason IS NOT NULL as invalid "
> + "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
>
> Can we revert back to 'conflicting as invalid' since it is a query for
> logical slots only.

I guess, no. There the intention is to check for invalid logical slots
not just for the conflicting ones. The logical slots can get
invalidated due to other reasons as well.

> 2)
> 040_standby_failover_slots_sync.pl:
>
> - q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM
> pg_replication_slots WHERE slot_name = 'lsub1_slot';}
> + q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary
> FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
>
> Here too, can we have 'NOT conflicting' instead of '
> invalidation_reason IS NULL' as it is a logical slot test.

I guess no. The tests are ensuring the slot on the standby isn't invalidated.

In general, one needs to use the 'conflicting' column from
pg_replication_slots when the intention is to look for reasons for
conflicts, otherwise use the 'invalidation_reason' column for
invalidations.

Please see the attached v10 patch set after resolving the merge
conflict and fixing an indentation warning in the TAP test file.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v10-0001-Track-invalidation_reason-in-pg_replication_slot.patch application/x-patch 19.8 KB
v10-0002-Add-XID-age-based-replication-slot-invalidation.patch application/x-patch 12.9 KB
v10-0003-Track-inactive-replication-slot-information.patch application/x-patch 10.0 KB
v10-0004-Add-inactive_timeout-based-replication-slot-inva.patch application/x-patch 11.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-03-15 12:10:14 Re: Weird test mixup
Previous Message Tomas Vondra 2024-03-15 11:38:26 Re: Add bump memory context type and use it for tuplesorts