Re: pg_reorg in core?

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: pg_reorg in core?
Date: 2012-09-26 12:39:36
Message-ID: CAB7nPqRQcYzp4aLG0+bU1g=SQpWBe5gLyAjRgcixzOJqJ8ts9g@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 26, 2012 at 8:13 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On Tuesday, September 25, 2012 01:48:34 PM Michael Paquier wrote:
> > On Tue, Sep 25, 2012 at 5:55 PM, Andres Freund <andres(at)2ndquadrant(dot)com
> >wrote:
> > > On Tuesday, September 25, 2012 04:37:05 AM Michael Paquier wrote:
> > > > On Tue, Sep 25, 2012 at 8:13 AM, Andres Freund <
> andres(at)2ndquadrant(dot)com
> > > >
> > > >wrote:
> > > > Could you clarify what do you mean here by cleanup?
> > > > I am afraid I do not get your point here.
> > >
> > > Sorry, was a bit tired when writing the above.
> > >
> > > The point is that to work concurrent the CONCURRENT operations
> > > commit/start multiple transactions internally. It can be interrupted
> > > (user, shutdown, error,
> > > crash) and leave transient state behind every time it does so. What I
> > > wanted to
> > > say is that we need to take care that each of those can easily be
> cleaned
> > > up
> > > afterwards.
> >
> > Sure, many errors may happen.
> > But, in the case of CREATE INDEX CONCURRENTLY, there is no clean up
> method
> > implemented as far as I know (might be missing something though). Isn't
> an
> > index only considered as invalid in case of failure for concurrent
> creation?
> Well, you can DROP or REINDEX the invalid index.
>
> There are several scenarios where you can get invalid indexes. Unique
> violations, postgres restarts, aborted index creation...
>
> > In the case of REINDEX it would be essential to create such a cleanup
> > mechanism as I cannot imagine a production database with an index that
> has
> > been marked as invalid due to a concurrent reindex failure, by assuming
> here,
> > of course, that REINDEX CONCURRENTLY would use the same level of process
> > error as CREATE INDEX CONCURRENTLY.
> Not sure what you're getting at?
>
I just meant that when CREATE INDEX CONCURRENTLY fails, the index being
created is marked as invalid, so the planner cannot use it.

Based on what you told before:
1) build new index with indisready = false
   newindex.indisready = true
   wait
2) newindex.indisvalid = true
   wait
3) swap(oldindex.relfilenode, newindex.relfilenode)
   oldindex.indisvalid = false
   wait
4) oldindex.indisready = false
   wait
   drop new index with old relfilenode

If the reindex fails at step 1 or 2, the new index is not usable, so the
relation ends up with an index that is not valid. If it fails at step 4,
the old index is invalid. If it fails at step 3, both indexes are valid
and both are usable for the given relation.
Do you think it is acceptable for the user to have to clean up the old or
new index himself after a failure?
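The failure analysis above can be checked with a small model. This is a
sketch (ordinary Python, not PostgreSQL code) that simulates the pg_index
flag states left behind when the four-step sequence is interrupted;
"failing at step N" means steps 1..N-1 completed and step N did not.

```python
def flags_after(completed_steps):
    """Return (old, new) flag dicts after `completed_steps` steps ran."""
    old = {"indisready": True, "indisvalid": True}
    new = {"indisready": False, "indisvalid": False}
    steps = [
        lambda: new.update(indisready=True),   # 1) new index built and ready
        lambda: new.update(indisvalid=True),   # 2) new index becomes valid
        lambda: old.update(indisvalid=False),  # 3) relfilenodes swapped, old invalidated
        lambda: old.update(indisready=False),  # 4) old index about to be dropped
    ]
    for step in steps[:completed_steps]:
        step()
    return old, new
```

For example, flags_after(1) (a failure during step 2) leaves the new index
with indisvalid = false, and flags_after(2) (a failure during step 3)
leaves both indexes valid, matching the analysis above.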

> > One of the possible cleanup mechanisms I got on top of my head is a
> > callback at transaction abort, each callback would need to be different
> for
> > each subtransaction used at during the concurrent operation.
> > In case the callback itself fails, well the old and/or new indexes become
> > invalid.
> That's not going to work. E.g. the session might have been aborted or such.
> Also, there is not much you can do from an callback at transaction end as
> you
> cannot do catalog modifications.
>
> I was thinking of REINDEX CONCURRENTLY CONTINUE or something vaguely
> similar.
>
You could also simply reissue the reindex command and avoid an additional
command. When launching a concurrent reindex, it could check whether an
index already exists that was created to replace the old one in a
previously failed attempt. In order to track that, why not add an
additional field in pg_index? When creating a new index concurrently, we
would register in its pg_index entry the OID of the index it is meant to
replace. When reissuing the command after a failure, it is then possible
to check whether an index has already been created by a previous REINDEX
CONCURRENTLY command and, based on the flag values of the old and new
indexes, replay the command from the step where it previously failed.
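A sketch of how a reissued command could pick its restart point. This
assumes the proposed extra pg_index field exists (linking the new index to
the OID of the one it replaces); the function below is hypothetical, and
each index is modeled as a dict of its pg_index flags.

```python
def resume_step(old, new):
    """Pick the step a reissued concurrent REINDEX should restart from."""
    if new is None:
        return 1      # no replacement index was created yet: start over
    if not new["indisvalid"]:
        return 2      # new index exists and is ready, but not yet valid
    if old["indisvalid"]:
        return 3      # both valid: the relfilenode swap is still pending
    if old["indisready"]:
        return 4      # old index invalidated, but not yet dropped
    return None       # the previous run actually finished
```

For instance, finding both the old and new indexes valid means the failure
happened during step 3, so the swap is replayed from there.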
--
Michael Paquier
http://michael.otacoo.com
