Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable
Date: 2020-08-22 15:03:09
Message-ID: 2146079.1598108589@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Thu, Aug 20, 2020 at 10:28 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> 2. On the other hand, the code is *releasing* the
>> ReplicationSlotControlLock before it calls
>> ProcArraySetReplicationSlotXmin, and that seems like a flat out
>> concurrency bug.

> It is not clear to me how those values can go backward.

After releasing ReplicationSlotControlLock, that code is holding no
lock at all (in the already_locked=false case I'm concerned about).
Thus the scenario to consider is:

1. Walsender A runs ReplicationSlotsComputeRequiredXmin, computes
some perfectly-valid xmins, releases ReplicationSlotControlLock,
amd then gets swapped out to Timbuktu.

2. Time passes and the "true values" of those xmins advance thanks
to other walsender activity.

3. Walsender B runs ReplicationSlotsComputeRequiredXmin, computes
some perfectly-valid xmins, and successfully stores them in the
procarray.

4. Walsender A returns from never-never land, and stores its now
quite stale results in the procarray, causing the globally visible
xmins to go backwards from where they were after step 3.

I see no mechanism in the code that prevents this scenario.
On reflection I'm not even very sure that the code change
I'm suggesting would prevent it. It'd prevent walsenders
from entering or exiting until we've updated the procarray,
but there's nothing to stop the furthest-back walsender from
advancing its values.

There may be some argument why this can't lead to a problem,
but I don't see any comments making such an argument, either.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-08-22 15:37:34 Re: xl_heap_header alignment?
Previous Message Michael Paquier 2020-08-22 12:18:14 Re: Creating a function for exposing memory usage of backend process