Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable
Date: 2020-08-22 10:46:51
Message-ID: CAA4eK1KSe-NrfiRd2cspy-8HAZ3NwXMefE4tw20BnAx53KE2iA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 20, 2020 at 10:28 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> While trying to make sense of Adam Sjøgren's problem [1], I found
> myself staring at ReplicationSlotsComputeRequiredXmin() in slot.c.
> It seems to me that that is very shaky code, on two different
> grounds:
>
> 1. Sometimes it's called with ProcArrayLock already held exclusively.
> This means that any delay in acquiring the ReplicationSlotControlLock
> translates directly into a hold on ProcArrayLock; in other words,
> every acquisition of the ReplicationSlotControlLock is just as bad
> for concurrency as an acquisition of ProcArrayLock. While I didn't
> see any places that were doing really obviously slow things while
> holding ReplicationSlotControlLock, this is disturbing. Do we really
> need it to be like that?
>
> 2. On the other hand, the code is *releasing* the
> ReplicationSlotControlLock before it calls
> ProcArraySetReplicationSlotXmin, and that seems like a flat out
> concurrency bug. How can we be sure that the values we're storing
> into the shared xmin fields aren't stale by the time we acquire
> the ProcArrayLock (in the case where we didn't hold it already)?
> I'm concerned that in the worst case this code could make the
> shared xmin fields go backwards.
>

It is not clear to me how those values can go backward. Basically, we
install those values in slots after getting it from
GetOldestSafeDecodingTransactionId() and then those always seem to get
advanced. And GetOldestSafeDecodingTransactionId() takes into account
the already stored shared values of replication_slot_xmin and
replication_slot_catalog_xmin for computing the xmin_horizon for slot.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-08-22 12:18:14 Re: Creating a function for exposing memory usage of backend process
Previous Message Konstantin Knizhnik 2020-08-22 07:16:28 INSERT ON CONFLICT and RETURNING