Re: pg_reorg in core?

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: pg_reorg in core?
Date: 2012-09-26 11:13:03
Message-ID: 201209261313.03268.andres@2ndquadrant.com
Lists: pgsql-hackers

On Tuesday, September 25, 2012 01:48:34 PM Michael Paquier wrote:
> On Tue, Sep 25, 2012 at 5:55 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On Tuesday, September 25, 2012 04:37:05 AM Michael Paquier wrote:
> > > On Tue, Sep 25, 2012 at 8:13 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > > Could you clarify what you mean here by cleanup?
> > > I am afraid I do not get your point here.
> >
> > Sorry, I was a bit tired when writing the above.
> >
> > The point is that, to work concurrently, the CONCURRENT operations
> > commit/start multiple transactions internally. They can be interrupted
> > (user, shutdown, error, crash) and leave transient state behind every time
> > they do so. What I wanted to say is that we need to take care that each of
> > those states can easily be cleaned up afterwards.
>
> Sure, many errors may happen.
> But, in the case of CREATE INDEX CONCURRENTLY, there is no cleanup method
> implemented as far as I know (I might be missing something though). Isn't an
> index simply marked as invalid if its concurrent creation fails?
Well, you can DROP or REINDEX the invalid index.

There are several scenarios that can leave you with an invalid index: unique
violations, postgres restarts, aborted index creation, ...
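
Concretely, the manual cleanup is along these lines (index and table names
are just placeholders):

  -- find indexes left invalid by a failed or interrupted concurrent build
  SELECT indexrelid::regclass FROM pg_index WHERE NOT indisvalid;

  -- either drop the invalid index and retry the concurrent build ...
  DROP INDEX some_idx;
  CREATE INDEX CONCURRENTLY some_idx ON some_table (some_col);

  -- ... or rebuild it in place, which takes an exclusive lock on the table
  REINDEX INDEX some_idx;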

> In the case of REINDEX it would be essential to create such a cleanup
> mechanism, as I cannot imagine a production database living with an index
> that has been marked invalid after a concurrent reindex failure, assuming
> here, of course, that REINDEX CONCURRENTLY would handle errors the same way
> CREATE INDEX CONCURRENTLY does.
Not sure what you're getting at?

> One possible cleanup mechanism off the top of my head is a callback at
> transaction abort; each callback would need to be different for each
> subtransaction used during the concurrent operation.
> In case the callback itself fails, well, the old and/or new indexes become
> invalid.
That's not going to work; e.g. the session might have been aborted, or
something similar. Also, there is not much you can do from a callback at
transaction end, as you cannot perform catalog modifications there.

I was thinking of REINDEX CONCURRENTLY CONTINUE or something vaguely similar.
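
Just to sketch what that could look like from the user's side (all of this
syntax is hypothetical at this point):

  -- a concurrent reindex that got interrupted half-way, leaving transient
  -- catalog state behind
  REINDEX CONCURRENTLY some_table;            -- aborted by cancel/crash

  -- later: finish, or clean up, whatever the interrupted run left around
  REINDEX CONCURRENTLY CONTINUE some_table;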

> > > > 2. no support for concurrent operation on system tables (not easy for
> > > > shared catalogs)
> > >
> > > Doesn't this exclude all the tables that are in the schema catalog?
> >
> > No. Only the relations returned by SELECT array_to_string(array_agg(relname),
> > ', ') FROM pg_class WHERE relisshared AND relkind = 'r';
> > plus their toast tables and their indexes are shared. The problem is that
> > for those you cannot create a separate index and let it be updated
> > concurrently, because you cannot write into each database's
> > pg_class/pg_index.

> Yes indeed, I didn't think about things that are shared among databases.
> Blocking that is pretty simple; it's only a matter of where the checks are
> placed.

It's just a bit sad to make the thing not really appear lockless ;)
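
For reference, the complete set, including those toast tables and indexes, is
just a small variant of that query:

  -- everything that lives in the shared (per-cluster) part of the catalogs
  SELECT oid::regclass, relkind
  FROM pg_class
  WHERE relisshared
  ORDER BY relkind, oid;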

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
