Re: Potential GIN vacuum bug

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Potential GIN vacuum bug
Date: 2015-08-30 17:47:47
Message-ID: CAMkU=1zsAe+WhaWwTMTU8LQgFSVfox7Bf+tpVDmPq6C-+VkUug@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 22, 2015 at 11:25 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Tue, Aug 18, 2015 at 8:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
>
>> On Mon, Aug 17, 2015 at 5:41 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> > User backends attempt to take the lock conditionally, because otherwise they would cause an autovacuum already holding the lock to cancel itself, which seems quite bad.
>> >
>> > Note that this is a substantial behavior change, in that with this code the user backends which find the list already being cleaned will just add to the end of the pending list and go about their business. So if things are added to the list faster than they can be cleaned up, the size of the pending list can increase without bound.
>> >
>> > Under the existing code each concurrent user backend will try to clean the pending list at the same time. The work doesn't parallelize, so doing this just burns CPU (and possibly consumes up to maintenance_work_mem for *each* backend), but it does serve to throttle the insertion rate and so keep the list from growing without bound.
>> >
>> > This is just a proof-of-concept patch, because I don't know if this approach is the right approach.
>>
>> I'm not sure if this is the right approach, but I'm a little wary of
>> involving the heavyweight lock manager in this. If pending list
>> cleanups are frequent, this could involve a lot of additional lock
>> manager traffic, which could be bad for performance.
>
>
>
>> Even if they are
>> infrequent, it seems like it would be more natural to handle this
>> without some regime of locks and pins and buffer cleanup locks on the
>> buffers that are storing the pending list, rather than a heavyweight
>> lock on the whole relation. But I am just waving my hands wildly
>> here.
>>
>
> I also thought of a buffer clean up lock on the pending list head buffer
> to represent the right to do a clean up. But with the proviso that once
> you have obtained the clean up lock, you can then drop the exclusive buffer
> content lock and continue to hold the conceptual lock just by maintaining
> the pin. I think that this would be semantically correct, but backends
> doing a cleanup would have to get the lock conditionally, and I think you
> would have too many chances for false failures where it bails out when the
> other party simply holds a pin. I guess I could implement it and see how
> it fares in my test case.
>

Attached is a patch to deal with this without the heavyweight locks.

I realized that using the cleanup lock on the meta page, rather than on the
pending head page, would be easier to implement: a pin is already held on the
meta page throughout, while the pending head page can change during the
cleanup if we need multiple passes.
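
In outline, the approach looks something like this (a sketch only, not the
attached patch verbatim; "index" and "vac" stand in for whatever relation and
is-this-vacuum flag are in scope at that point):

    /*
     * Sketch: treat a cleanup lock on the GIN metapage as the right to clean
     * the pending list.  Vacuum waits for it; user backends only try for it
     * and skip the cleanup if someone else already holds it.
     */
    Buffer      metabuffer;

    metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);

    if (vac)
    {
        /* vacuum must clean the list, so wait for the cleanup lock */
        LockBufferForCleanup(metabuffer);
    }
    else
    {
        /* user backends bail out rather than queue up behind another cleaner */
        if (!ConditionalLockBufferForCleanup(metabuffer))
        {
            ReleaseBuffer(metabuffer);
            return;
        }
    }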

Also, I think the cleanup lock on the metapage should actually be easier to
obtain. All queries need to visit the pending head page, and they hold it long
enough to check all the keys on that page (possibly hundreds of them, possibly
checked with arbitrarily slow SQL functions). The metapage is only inspected
for a few variables which are plain C types.
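
For comparison, a scan's visit to the metapage is only about this much work
(an approximation of the existing access pattern, with variable names made up
for the sketch):

    /*
     * Approximate shape of a scan's stop at the metapage: pin it, share-lock
     * it just long enough to copy out the head block number, then move on to
     * the pending head page.
     */
    metabuffer = ReadBuffer(scan->indexRelation, GIN_METAPAGE_BLKNO);
    LockBuffer(metabuffer, GIN_SHARE);
    blkno = GinPageGetMeta(BufferGetPage(metabuffer))->head;

    if (blkno == InvalidBlockNumber)
    {
        /* no pending list to scan */
        UnlockReleaseBuffer(metabuffer);
        return;
    }

    /* pin and lock the pending head before letting go of the metapage */
    pendingBuffer = ReadBuffer(scan->indexRelation, blkno);
    LockBuffer(pendingBuffer, GIN_SHARE);
    UnlockReleaseBuffer(metabuffer);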

I thought of checking for metadata->head == InvalidBlockNumber with a share
lock before getting the cleanup lock, and then again after, but decided
against it: the user backends already check that immediately before calling
this function, and wouldn't call it if there was no pending list as of that
check.
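
For reference, the check I decided against would have looked roughly like:

    /* peek at the list head under a share lock; skip the cleanup if empty */
    LockBuffer(metabuffer, GIN_SHARE);
    empty = (GinPageGetMeta(BufferGetPage(metabuffer))->head == InvalidBlockNumber);
    LockBuffer(metabuffer, GIN_UNLOCK);

    if (empty)
    {
        ReleaseBuffer(metabuffer);
        return;
    }

    /* ... acquire the cleanup lock, then repeat the same test before proceeding ... */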

I exchange the exclusive content lock given to us by LockBufferForCleanup for
a share content lock, so as not to hold the exclusive lock over the IO
possibly needed to read the pending head page into buffers. I don't know that
this is actually a win.
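
In code terms, the exchange is just the following (again a sketch, with
metabuffer being the cleanup-locked metapage buffer):

    /*
     * LockBufferForCleanup leaves us holding an exclusive content lock plus
     * the pin.  The pin is what conceptually holds the cleanup lock, so we
     * can drop to a share lock and still keep it, avoiding the possible read
     * of the pending head page under an exclusive lock.
     */
    LockBuffer(metabuffer, GIN_UNLOCK);
    LockBuffer(metabuffer, GIN_SHARE);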

This fixes the same problem I was seeing as the previous heavyweight-lock
patch did.

I still prefer the heavy-weight approach. The buffer clean up lock for
vacuuming seems fragile to start with, and abusing it for other purposes
doesn't improve on that.

Whichever approach is taken, more work is needed on the comments. And the
code that currently checks for concurrent cleanup and bails out needs to be
changed to throw errors or something instead. But I don't want to make too
many changes until I know which approach to take, and whether it will be
back-patched.

Cheers,

Jeff

Attachment Content-Type Size
gin_pending_lwlock.patch application/octet-stream 1.5 KB
