Re: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl
Date: 2016-02-16 00:41:28
Message-ID: CA+Tgmob6OM-2F+nzJyY22D0nF3WDEBvVzHgFtCnP89K=vkGcQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Sun, Feb 14, 2016 at 5:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Sun, Feb 14, 2016 at 1:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> pgprocno of the specific PGPROC, or of the group leader? If it's
>>> the former, I'm pretty suspicious that there are deadlock and/or
>>> linked-list-corruption hazards here. If it's the latter, at least
>>> the comments around this are misleading.
>
>> Leader. The other way would be nuts.
>
> ... and btw, either BecomeLockGroupMember() has got this backwards, or
> I'm still fundamentally confused.

Yeah, you're right. Attached is a draft patch that tries to clean up
that and a bunch of other things that you raised. It hasn't really
been tested yet, so it might be buggy; I wrote it during my long plane
flight. Fixes include:

1. It removes the groupLeader flag from PGPROC. You're right: we don't need it.

2. It fixes this problem with BecomeLockGroupMember(). You're right:
the old way was backwards.

3. It fixes TopoSort() to be less inefficient by checking whether the
EDGE is for the correct lock before doing anything else. I realized
this while updating comments related to EDGE.

4. It adds some text to the lmgr README.

5. It reflows the existing text in the lmgr README to fit within 80 columns.

6. It adjusts some other comments.

Possibly this should be broken up into multiple patches, but I'm just
sending it along all together for now.

> Also, after fixing that it would be good to add a comment explaining why
> it's not fundamentally unsafe for BecomeLockGroupMember() to examine
> leader->pgprocno without having any relevant lock. AFAICS, that's only
> okay because the pgprocno fields are never changed after InitProcGlobal,
> even when a PGPROC is recycled.

I am sort of disinclined to think that this needs a comment. Do we
really have a comment every other place that pgprocno is referenced
without a lock? Or maybe there are none, but it seems like overkill
to me to comment on that at every site of use. Better to comment on
the pgprocno definition itself and say "never changes".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
group-locking-fixes-v1.patch application/x-patch 17.7 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2016-02-16 00:55:45 Re: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl
Previous Message Joe Conway 2016-02-16 00:40:35 pgsql: Move DATA entry to correct position

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2016-02-16 00:51:03 Re: A bit of PG archeology uncovers an interesting Linux/Unix factoid
Previous Message Noah Misch 2016-02-16 00:38:44 Re: postgres_fdw vs. force_parallel_mode on ppc