Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, chenhj <chjischj(at)163(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock
Date: 2018-12-06 20:51:38
Message-ID: CAPpHfduQT2WJpE8ZwwDEFCeaLUqac9hMekOtgq=wtGaTXc+E8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Thu, Dec 6, 2018 at 10:56 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> On Tue, Dec 4, 2018 at 8:01 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2018-11-10 17:42:16 -0800, Peter Geoghegan wrote:
> > > On Wed, Nov 7, 2018 at 5:46 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > > > Teodor: Do you think that the issue is fixable? It looks like there
> > > > are serious issues with the design of 218f51584d5 to me. I don't think
> > > > the general "there can't be any inserters at this subtree" thing works
> > > > given that we have to couple buffer locks when moving right for other
> > > > reasons. We call ginStepRight() within ginFinishSplit(), for reasons
> > > > that date back to bug fix commit ac4ab97e from 2013 -- that detail is
> > > > probably important, because it seems to be what breaks the subtree
> > > > design (we don't delete in two phases or anything in GIN).
> > >
> > > Ping?
> > >
> > > This is a thorny problem, and I'd like to get your input soon. I
> > > suspect that reverting 218f51584d5 may be the ultimate outcome, even
> > > though it's a v10 feature.
> >
> > Teodor, any chance for a response here? It's not OK to commit something
> > and then just not respond to bug-reports, especially when they're well
> > diagnose and clearly point towards issues in a commit of yours.
>
> Unfortunately, Teodor is unreachable at least till next week.
>
> > Alexander, CCing you, perhaps you could point the thread out to Teodor?
>
> I'll take a look at this issue.

I've run through the thread.

So, algorithm introduced by 218f51584d5 is broken. It tries to
guarantee that there are no inserters in the subtree by placing
cleanup lock to subtree root, assuming that inserter holds pins on the
path from root to leaf. But due to concurrent splits of internal
pages the pins held can be not relevant to actual path. I don't see
the way to fix this easily. So, I think we should revert it from back
branches and try to reimplement that in master.

However, I'd like to note that 218f51584d5 introduces two changes:
1) Cleanup locking only if there pages to delete
2) Cleanup locking only subtree root
The 2nd one is broken. But the 1st one seems still good for me and
useful, because in vast majority of cases vacuum doesn't delete any
index pages. So, I propose to revert 218f51584d5, but leave there
logic, which locks root for cleanup only once there are pages to
delete. Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul Ramsey 2018-12-06 20:54:18 Re: Compressed TOAST Slicing
Previous Message Alvaro Herrera 2018-12-06 20:37:59 Re: minor fix in CancelVirtualTransaction