Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-03-26 17:52:12
Message-ID: ZgMLTBD3xJIP9W93@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, Mar 26, 2024 at 09:59:23PM +0530, Bharath Rupireddy wrote:
> On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > If we just sync inactive_since value for synced slots while in
> > recovery from the primary, so be it. Why do we need to update it to
> > the current time when the slot is being created? We don't expose slot
> > creation time, no? Aren't we fine if we just sync the value from
> > primary and document that fact? After the promotion, we can reset it
> > to the current time so that it gets its own time.
>
> I'm attaching v24 patches. It implements the above idea proposed
> upthread for synced slots. I've now separated
> s/last_inactive_time/inactive_since and synced slots behaviour. Please
> have a look.

Thanks!

==== v24-0001

It's now pure mechanical changes and it looks good to me.

==== v24-0002

1 ===

This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.

Tested it and it does that.

2 ===

2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.

Tested it and it does that.

3 ===

+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+update_synced_slots_inactive_time(void)

Looks like the comment "reset" is not matching the name of the function and
what it does.

4 ===

+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();

Also +1 of having GetCurrentTimestamp() just called one time within the loop.

5 ===

- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;

Not related to this change but more the way RestoreSlotFromDisk() behaves here:

For a sync slot on standby it will be set to zero and then later will be
synchronized with the one coming from the primary. I think that's fine to have
it to zero for this window of time.

Now, if the standby is down and one sets sync_replication_slots to off,
then inactive_since will be set to zero on the standby at startup and not
synchronized (unless one triggers a manual sync). I also think that's fine but
it might be worth to document this behavior (that after a standby startup
inactive_since is zero until the next sync...).

6 ===

+ print "HI $slot_name $name $inactive_since $slot_creation_time\n";

garbage?

7 ===

+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+ my $name = $node->name;

We know have capture_and_validate_slot_inactive_since at 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.

Worth to create a sub in Cluster.pm?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2024-03-26 18:01:58 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Previous Message Andrey M. Borodin 2024-03-26 17:26:14 Re: UUID v7