Re: Should I implement DROP INDEX CONCURRENTLY?

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Should I implement DROP INDEX CONCURRENTLY?
Date: 2012-01-21 15:11:04
Message-ID: CA+U5nMKyHikNHJOaRkLJU3Fn+aribZ5crEuHwhuxS_sJfQiQYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 21, 2012 at 1:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jan 19, 2012 at 10:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Wed, Jan 18, 2012 at 11:12 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Wed, Jan 18, 2012 at 5:49 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> Can I just check with you that the only review comment is a one line
>>>> change? Seems better to make any additional review comments in one go.
>>>
>>> No, I haven't done a full review yet - I just noticed that right at
>>> the outset and wanted to be sure that got addressed.  I can look it
>>> over more carefully, and test it.
>>
>> Corrected your point, and another found during retest.
>>
>> Works as advertised, docs corrected.
>
> This comment needs updating:
>
>    /*
>     * To drop an index safely, we must grab exclusive lock on its parent
>     * table.  Exclusive lock on the index alone is insufficient because
>     * another backend might be about to execute a query on the parent table.
>     * If it relies on a previously cached list of index OIDs, then it could
>     * attempt to access the just-dropped index.  We must therefore take a
>     * table lock strong enough to prevent all queries on the table from
>     * proceeding until we commit and send out a shared-cache-inval notice
>     * that will make them update their index lists.
>     */

OK, I did reword some but clearly not enough.

> Speaking of that comment, why does a concurrent index drop need any
> lock at all on the parent table?  If, as the current comment text
> says, the point of locking the parent table is to make sure that no
> new backends can create relcache entries until our transaction
> commits, then it's not needed here, or at least not for that purpose.
> We're going to out-wait anyone who has the index open *after* we've
> committed our changes to the pg_index entry, so there shouldn't be a
> race condition there.  There may very well be a reason why we need a
> lock on the parent, or there may not, but that's not it.

I feel happier locking the parent as well. I'd rather avoid strange
weirdness later as has happened before in this type of discussion.

> On the flip side, it strikes me that there's considerable danger that
> ShareUpdateExclusiveLock on the index might not be strong enough.  In
> particular, it would fall down if there's anyone who takes an
> AccessShareLock, RowShareLock, or RowExclusiveLock and then proceeds
> to access the index data despite its being marked !indisready and
> !indisvalid.  There are certainly instances of this in contrib, such
> as in pgstatindex(), which I'm pretty sure will abort with a nastygram
> if somebody truncates away the file under it.  That might not be
> enough reason to reject the patch, but I do think it would be the only
> place in the system where we allow something like that to happen.  I'm
> not sure whether there are any similar risks in core - I am a bit
> worried about get_actual_variable_range() and gincostestimate(), but I
> can't tell for sure whether there is any actual problem there, or
> elsewhere.  I wonder if we should only do the *first* phase of the
> drop with ShareUpdateExcusliveLock, and then after outwaiting
> everyone, take an AccessExclusiveLock on the index (but not the table)
> to do the rest of the work.  If that doesn't allow the drop to proceed
> concurrently, then we go find the places that block against it and
> teach them not to lock/use/examine indexes that are !indisready.  That
> way, the worst case is that D-I-C is less concurrent than we'd like,
> rather than making random other things fall over - specifically,
> anything that count on our traditional guarantee that AccessShareLock
> is enough to keep the file from disappearing.

Your caution is wise. All users of an index have already checked
whether the index is usable at plan time, so although there is much
code that assumes they can look at the index itself, that is not
executed until after the correct checks.

I'll look at VACUUM and other utilities, so thanks for that.

I don't see much point in having the higher level lock, except perhaps
as a test this code works.

> In the department of minor nitpicks, the syntax synopsis you've added
> looks wrong:
>
> DROP INDEX
>       [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
> [, ...] [ CASCADE | RESTRICT ]
>       CONCURRENTLY <replaceable class="PARAMETER">name</replaceable>
>
> That implies that you may write if exists, followed by 1 or more
> names, followed by cascade or restrict.  After that you must write
> CONCURRENTLY, followed by exactly one name.  I think what you want is:
>
> DROP INDEX [ IF EXISTS ] <replaceable
> class="PARAMETER">name</replaceable> [, ...] [ CASCADE | RESTRICT ]
> DROP INDEX CONCURRENTLY <replaceable class="PARAMETER">name</replaceable>
>
> ...but I also wonder if we shouldn't support DROP INDEX [ IF EXISTS ]
> CONCURRENTLY.  It could also be very useful to be able to concurrently
> drop multiple indexes at once, since that would require waiting out
> all concurrent transactions only once instead of once per drop.
> Either way, I think we should have only one syntax, and then reject
> combinations we can't handle in the code.  So I think we should have
> the parser accept:
>
> DROP INDEX [ IF EXISTS ] [ CONCURRENTLY ] <replaceable
> class="PARAMETER">name</replaceable> [, ...] [ CASCADE | RESTRICT ]
>
> And then, if you try to use cascade, or try to drop multiple indexes
> at once, we should throw an error saying that those options aren't
> supported in combination with CONCURRENTLY, rather than rejecting them
> as a syntax error at parse time.  That gives a better error message
> and will make it easier for anyone who wants to extend this in the
> future - plus it will allow things like DROP INDEX CONCURRENTLY bob
> RESTRICT, which ought to be legal even if CASCADE isn't supported.

The syntax diagram is correct because there are restrictions there.

That's mostly because of problems trying to get the grammar clean and
conflict free, so that's the best I could do.

If you or anyone else know better how to do that, I'd appreciate some
insight there.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2012-01-21 15:31:21 Re: CLOG contention, part 2
Previous Message Robert Haas 2012-01-21 13:57:45 Re: CLOG contention, part 2