Re: Support for REINDEX CONCURRENTLY

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2013-03-06 11:59:37
Message-ID: CAB7nPqTuR_7gqxU3UCt_oYNHsjRydmm4KxGoPs=_DtvPtrhk9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

OK. Patches updated... Please see attached.
With all the work done on those patches, I suppose this is close to being
something clean...

On Wed, Mar 6, 2013 at 5:50 PM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:

> On 2013-03-06 13:21:27 +0900, Michael Paquier wrote:
> > Hum... OK. I changed that using a method based on get_index_constraint
> for
> > a given index. So if the constraint Oid is invalid, it means that this
> > index has no constraints and its concurrent entry won't create an index
> in
> > consequence. It is more stable this way.
>
> Sounds good. Just to make that clear:
> To get a unique index without constraint:
> CREATE TABLE table_u(id int, data int);
> CREATE UNIQUE INDEX table_u__data ON table_u(data);
> To get a constraint:
> ALTER TABLE table_u ADD CONSTRAINT table_u__id_unique UNIQUE(id);
>
OK no problem. Thanks for the clarification.

> > > > > > + /*
> > > > > > + * Phase 3 of REINDEX CONCURRENTLY
> > > > > > + *
> > > > > > + * During this phase the concurrent indexes catch up with
> the
> > > > > INSERT that
> > > > > > + * might have occurred in the parent table and are marked
> as
> > > valid
> > > > > once done.
> > > > > > + *
> > > > > > + * We once again wait until no transaction can have the
> table
> > > open
> > > > > with
> > > > > > + * the index marked as read-only for updates. Each index
> > > > > validation is done
> > > > > > + * with a separate transaction to avoid opening transaction
> > > for an
> > > > > > + * unnecessary too long time.
> > > > > > + */
> > > > >
> > > > > Maybe I am being dumb because I have the feeling I said
> differently in
> > > > > the past, but why do we not need a WaitForMultipleVirtualLocks()
> here?
> > > > > The comment seems to say we need to do so.
> > > > >
> > > > Yes you said the contrary in a previous review. The purpose of this
> > > > function is to first gather the locks and then wait for everything at
> > > once
> > > > to reduce possible conflicts.
> > >
> > > you say:
> > >
> > > + * We once again wait until no transaction can have the table
> open
> > > with
> > > + * the index marked as read-only for updates. Each index
> > > validation is done
> > > + * with a separate transaction to avoid opening transaction
> for an
> > > + * unnecessary too long time.
> > >
> > > Which doesn't seem to be done?
> > >
> > > I read back and afaics I only referred to
> CacheInvalidateRelcacheByRelid
> > > not being necessary in this phase. Which I think is correct.
> > >
> > Regarding CacheInvalidateRelcacheByRelid at phase 3, I think that it is
> > needed. If we don't use it the pg_index entries will be updated but not
> the
> > cache, what is incorrect.
>
> A heap_update will cause cache invalidations to be sent.
>
Ok. removed it.

> > Anyway, if I claimed otherwise, I think I was wrong:
> > >
> > > The reason - I think - we need to wait here is that otherwise its not
> > > guaranteed that all other backends see the index with ->isready
> > > set. Which means they might add tuples which are invisible to the mvcc
> > > snapshot passed to validate_index() (just created beforehand) which are
> > > not yet added to the new index because those backends think the index
> is
> > > not ready yet.
> > > Any flaws in that logic?
> > >
> > Not that I think. In consequence, and I think we will agree on that: I am
> > removing WaitForMultipleVirtualLocks and add a WaitForVirtualLock on the
> > parent relation for EACH index before building and validating it.
>
> I have the feeling we are talking past each other. Unless I miss
> something *there is no* WaitForMultipleVirtualLocks between phase 2 and
> 3. But one WaitForMultipleVirtualLocks for all would be totally
> sufficient.
>
OK, sorry for the confusion. I added a call to WaitForMultipleVirtualLocks
also before phase 3.
Honestly, I am still not very comfortable with the fact that the ShareLock
wait on parent relation is done outside each index transaction for build
and validation... Changed as requested though...
--
Michael

Attachment Content-Type Size
20130306_1_remove_reltoastidxid_v4.patch application/octet-stream 39.1 KB
20130306_2_reindex_concurrently_v19.patch application/octet-stream 75.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-03-06 12:09:43 Re: Support for REINDEX CONCURRENTLY
Previous Message Shigeru Hanada 2013-03-06 11:37:03 Re: Writable foreign tables: how to identify rows