Re: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned
Date: 2019-02-16 04:08:01
Message-ID: CAEepm=0t8o=Wh4wi0H58q3G1dqoj6ZYU-zu9DMp29RkVMGSvNw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 16, 2019 at 3:38 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> I saw this error once last week while stress testing to reproduce earlier bugs,
> but tentatively thought it was a downstream symptom of those bugs (since
> fixed), and now wanted to check that #15585 and others were no longer
> reproducible. Unfortunately I got this error while running same test case [2]
> as for previous bug ('could not attach').
>
> 2019-02-14 23:40:41.611 MST [32287] ERROR: cannot unpin a segment that is not pinned
>
> On commit faf132449c0cafd31fe9f14bbf29ca0318a89058 (REL_11_STABLE including
> both of last week's post-11.2 DSA patches), I reproduced twice, once within
> ~2.5 hours, once within 30min.
>
> I'm not able to reproduce on master running overnight and now 16+hours.

Oh, I think I know why: dsm_unpin_segment() contains another variant
of the race fixed by 6c0fb941. That fix was for dsm_attach() being
confused by segments with the same handle that are concurrently going
away, but dsm_unpin_segment() does a handle lookup too, so it can be
confused by the same phenomenon. Untested, but the fix is probably:

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index cfbebeb31d..23ccc59f13 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -844,8 +844,8 @@ dsm_unpin_segment(dsm_handle handle)
     LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);
     for (i = 0; i < dsm_control->nitems; ++i)
     {
-        /* Skip unused slots. */
-        if (dsm_control->item[i].refcnt == 0)
+        /* Skip unused slots and segments that are concurrently going away. */
+        if (dsm_control->item[i].refcnt <= 1)
             continue;

         /* If we've found our handle, we can stop searching. */
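
To illustrate the race described above, here is a minimal toy sketch. It is
not PostgreSQL code: ToySlot, toy_find_slot() and toy_demo_handle_race() are
hypothetical names, the handle value 42 is made up, and the sketch assumes
that a slot with refcnt == 1 is one that is going away and is no longer
marked as pinned. It only shows how a lookup that skips refcnt == 0 alone can
stop at a dying slot that shares the handle, while skipping refcnt <= 1
reaches the live one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one control-table slot; NOT PostgreSQL's real struct. */
typedef struct
{
    uint32_t handle;   /* segment handle; may be reused after destruction */
    uint32_t refcnt;   /* 0 = unused, 1 = going away, 2+ = live */
    bool     pinned;   /* assumption: a slot being torn down is not pinned */
} ToySlot;

/*
 * Return the index of the first slot matching 'handle' whose refcnt is
 * greater than 'skip_at_or_below', or -1 if there is none.
 */
static int
toy_find_slot(const ToySlot *items, size_t nitems, uint32_t handle,
              uint32_t skip_at_or_below)
{
    for (size_t i = 0; i < nitems; ++i)
    {
        if (items[i].refcnt <= skip_at_or_below)
            continue;
        if (items[i].handle == handle)
            return (int) i;
    }
    return -1;
}

/* Build the racy state and check what each lookup rule finds. */
static void
toy_demo_handle_race(void)
{
    ToySlot items[] = {
        {42, 1, false},   /* old segment, concurrently going away */
        {42, 2, true},    /* new pinned segment that reused handle 42 */
    };
    size_t  nitems = sizeof(items) / sizeof(items[0]);

    /* Skipping only refcnt == 0 lands on the dying, unpinned slot. */
    int old_rule = toy_find_slot(items, nitems, 42, 0);
    assert(old_rule == 0 && !items[old_rule].pinned);

    /* Skipping refcnt <= 1 reaches the live, pinned slot. */
    int new_rule = toy_find_slot(items, nitems, 42, 1);
    assert(new_rule == 1 && items[new_rule].pinned);
}
```

Calling toy_demo_handle_race() from a main() exercises both rules. In the
first case the caller would go on to find the slot unpinned, which is
presumably how "cannot unpin a segment that is not pinned" gets raised.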

--
Thomas Munro
http://www.enterprisedb.com
