From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable |
Date: | 2020-08-24 06:13:40 |
Message-ID: | CAA4eK1LT=O_39MHSQZheibvwZ8A3AnTG2Fu3RFkq+9vAqyvgyQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Aug 22, 2020 at 8:33 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> > On Thu, Aug 20, 2020 at 10:28 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> 2. On the other hand, the code is *releasing* the
> >> ReplicationSlotControlLock before it calls
> >> ProcArraySetReplicationSlotXmin, and that seems like a flat out
> >> concurrency bug.
>
> > It is not clear to me how those values can go backward.
>
> After releasing ReplicationSlotControlLock, that code is holding no
> lock at all (in the already_locked=false case I'm concerned about).
> Thus the scenario to consider is:
>
> 1. Walsender A runs ReplicationSlotsComputeRequiredXmin, computes
> some perfectly-valid xmins, releases ReplicationSlotControlLock,
> amd then gets swapped out to Timbuktu.
>
> 2. Time passes and the "true values" of those xmins advance thanks
> to other walsender activity.
>
> 3. Walsender B runs ReplicationSlotsComputeRequiredXmin, computes
> some perfectly-valid xmins, and successfully stores them in the
> procarray.
>
> 4. Walsender A returns from never-never land, and stores its now
> quite stale results in the procarray, causing the globally visible
> xmins to go backwards from where they were after step 3.
>
> I see no mechanism in the code that prevents this scenario.
> On reflection I'm not even very sure that the code change
> I'm suggesting would prevent it. It'd prevent walsenders
> from entering or exiting until we've updated the procarray,
> but there's nothing to stop the furthest-back walsender from
> advancing its values.
>
I think we can prevent that if we allow
ProcArraySetReplicationSlotXmin to update the shared values only when
new xmin values follows the shared values. I am not very sure if it is
safe but I am not able to think of a problem with it. The other way
could be to always acquire ProcArrayLock before calling
ReplicationSlotsComputeRequiredXmin or before acquiring
ReplicationSlotControlLock but that seems too restrictive.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | hubert depesz lubaczewski | 2020-08-24 06:53:29 | Re: [PG13] Planning (time + buffers) data structure in explain plan (format text) |
Previous Message | Dilip Kumar | 2020-08-24 06:12:12 | Re: Re: [HACKERS] Custom compression methods |