Quick Links

Re: Support for REINDEX CONCURRENTLY

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: Support for REINDEX CONCURRENTLY
Date:	2013-03-06 04:21:27
Message-ID:	CAB7nPqQUGYnwx6hz687cPWkpJvGjy5n5LvhJKi5eOE25qbz5ig@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Please find attached updated patch realigned with your comments. You can
find my answers inline...
The only thing that needs clarification is the comment about
UNIQUE_CHECK_YES/UNIQUE_CHECK_NO. Except that all the other things are
corrected or adapted to what you wanted. I am also including now tests for
matviews.

On Wed, Mar 6, 2013 at 1:49 AM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:

> On 2013-03-05 22:35:16 +0900, Michael Paquier wrote:
>
> > > + for (count = 0; count < num_indexes; count++)
> > > > + index_insert(toastidxs[count], t_values,
> t_isnull,
> > > > + &(toasttup->t_self),
> > > > + toastrel,
> > > > +
> > > toastidxs[count]->rd_index->indisunique ?
> > > > + UNIQUE_CHECK_YES :
> > > UNIQUE_CHECK_NO);
> > >
> > > The indisunique check looks like a copy & pasto to me, albeit not
> > > yours...
> > >
> > Yes it is the same for all the indexes normally, but it looks more solid
> to
> > me to do that as it is. So unchanged.
>
> Hm, if the toast indexes aren't unique anymore loads of stuff would be
> broken. Anyway, not your "fault".
>
I definitely cannot understand where you are going here. Could you be more
explicit? Why could this be a problem? Without my patch a similar check is
used for toast indexes.

>
> > >
> > > > + /* Obtain index list if necessary */
> > > > + if (toastRel1->rd_indexvalid == 0)
> > > > + RelationGetIndexList(toastRel1);
> > > > + if (toastRel2->rd_indexvalid == 0)
> > > > + RelationGetIndexList(toastRel2);
> > > > +
> > > > + /* Check if the swap is possible for all the toast
> indexes
> > > */
> > >
> > > So there's no error being thrown if this turns out not to be possible?
> > >
> > There are no errors also in the former process... This should fail
> > silently, no?
>
> Not sure what you mean by "former process"? So far I don't see any
> reason why it would be a good idea to fail silently. We end up with
> corrupt data if the swap is silently not performed.
>
OK added an error and a check on the size of rd_indexlist to make things
better suited.

> > > > + if (count == 0)
> > > > + snprintf(NewToastName,
> > > NAMEDATALEN, "pg_toast_%u_index",
> > > > + OIDOldHeap);
> > > > + else
> > > > + snprintf(NewToastName,
> > > NAMEDATALEN, "pg_toast_%u_index_cct%d",
> > > > + OIDOldHeap,
> > > count);
> > > > + RenameRelationInternal(lfirst_oid(lc),
> > > > +
> > > NewToastName);
> > > > + count++;
> > > > + }
> > >
> > > Hm. It seems wrong that this layer needs to know about _cct.
> > >
> > Any other idea? For the time being I removed cct and added only a suffix
> > based on the index number...
>
> Hm. It seems like throwing an error would be sufficient, that path is
> only entered for shared catalogs, right? Having multiple toast indexes
> would be a bug.
>
Don't think so. Even if now those APIs are used only for catalog tables, I
do not believe that this function has been designed to be used only with
shared catalogs. Removing the cct suffix makes sense though...

> > > > + /*
> > > > + * Index is considered as a constraint if it is PRIMARY KEY or
> > > EXCLUSION.
> > > > + */
> > > > + isconstraint = indexRelation->rd_index->indisprimary ||
> > > > + indexRelation->rd_index->indisexclusion;
> > >
> > > unique constraints aren't mattering here?
> > >
> > No they are not. Unique indexes are not counted as constraints in the
> case
> > of index_create. Previous versions of the patch did that but there are
> > issues with unique indexes using expressions.
>
> Hm. index_create's comment says:
> * isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION
> constraint
>
> There are unique indexes that are constraints and some that are
> not. Looking at ->indisunique is not sufficient to determine whether its
> one or not.
>
Hum... OK. I changed that using a method based on get_index_constraint for
a given index. So if the constraint Oid is invalid, it means that this
index has no constraints and its concurrent entry won't create an index in
consequence. It is more stable this way.

> > > +void
> > > > +index_concurrent_drop(Oid indexOid)
> > > > +{
> > > > + Oid constraintOid =
> > > get_index_constraint(indexOid);
> > > > + ObjectAddress object;
> > > > + Form_pg_index indexForm;
> > > > + Relation pg_index;
> > > > + HeapTuple indexTuple;
> > > > + bool indislive;
> > > > +
> > > > + /*
> > > > + * Check that the index dropped here is not alive, it might be
> > > used by
> > > > + * other backends in this case.
> > > > + */
> > > > + pg_index = heap_open(IndexRelationId, RowExclusiveLock);
> > > > +
> > > > + indexTuple = SearchSysCacheCopy1(INDEXRELID,
> > > > +
> > > ObjectIdGetDatum(indexOid));
> > > > + if (!HeapTupleIsValid(indexTuple))
> > > > + elog(ERROR, "cache lookup failed for index %u",
> indexOid);
> > > > + indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
> > > > + indislive = indexForm->indislive;
> > > > +
> > > > + /* Clean up */
> > > > + heap_close(pg_index, RowExclusiveLock);
> > > > +
> > > > + /* Leave if index is still alive */
> > > > + if (indislive)
> > > > + return;
> > >
> > > This seems like a confusing path? Why is it valid to get here with a
> > > valid index and why is it ok to silently ignore that case?
> > >
> > I added that because of a comment of one of the past reviews. Personally
> I
> > think it makes more sense to remove that for clarity.
>
> Imo it should be an elog(ERROR) or an Assert().
>
Assert. Added.

> > > + default:
> > > > + /* nothing to do */
> > > > + break;
> > >
> > > Shouldn't we error out?
> > >
> > Don't think so. For example what if the relation is a matview? For
> REINDEX
> > DATABASE this could finish as an error because a materialized view is
> > listed as a relation to reindex. I prefer having this path failing
> silently
> > and leave if there are no indexes.
>
> Imo default fallthroughs makes it harder to adjust code. And afaik its
> legal to add indexes to materialized views which kinda proofs my point.
> And if that path is reached for plain views, sequences or toast tables
> its an error.
>
Added an error message. Matviews are now correctly handled (per se report
from Masao).

> > > > + /*
> > > > + * Phase 3 of REINDEX CONCURRENTLY
> > > > + *
> > > > + * During this phase the concurrent indexes catch up with the
> > > INSERT that
> > > > + * might have occurred in the parent table and are marked as
> valid
> > > once done.
> > > > + *
> > > > + * We once again wait until no transaction can have the table
> open
> > > with
> > > > + * the index marked as read-only for updates. Each index
> > > validation is done
> > > > + * with a separate transaction to avoid opening transaction
> for an
> > > > + * unnecessary too long time.
> > > > + */
> > >
> > > Maybe I am being dumb because I have the feeling I said differently in
> > > the past, but why do we not need a WaitForMultipleVirtualLocks() here?
> > > The comment seems to say we need to do so.
> > >
> > Yes you said the contrary in a previous review. The purpose of this
> > function is to first gather the locks and then wait for everything at
> once
> > to reduce possible conflicts.
>
> you say:
>
> + * We once again wait until no transaction can have the table open
> with
> + * the index marked as read-only for updates. Each index
> validation is done
> + * with a separate transaction to avoid opening transaction for an
> + * unnecessary too long time.
>
> Which doesn't seem to be done?
>
> I read back and afaics I only referred to CacheInvalidateRelcacheByRelid
> not being necessary in this phase. Which I think is correct.
>
Regarding CacheInvalidateRelcacheByRelid at phase 3, I think that it is
needed. If we don't use it the pg_index entries will be updated but not the
cache, what is incorrect.

Anyway, if I claimed otherwise, I think I was wrong:
>
> The reason - I think - we need to wait here is that otherwise its not
> guaranteed that all other backends see the index with ->isready
> set. Which means they might add tuples which are invisible to the mvcc
> snapshot passed to validate_index() (just created beforehand) which are
> not yet added to the new index because those backends think the index is
> not ready yet.
> Any flaws in that logic?
>
Not that I think. In consequence, and I think we will agree on that: I am
removing WaitForMultipleVirtualLocks and add a WaitForVirtualLock on the
parent relation for EACH index before building and validating it.

> It is better like this. The end of the process needs to be done inside a
> > transaction, so not committing immediately the last drop makes sense, no?
>
> I pretty much dislike this. If we need to leave a transaction open
> (why?), that should happen a function layer above.
>
Changed as requested.

> > Not sure why UnlockRelationIdForSession needs to be run in a transaction
> > > anyway?
> > >
> > Even in the case of CREATE INDEX CONCURRENTLY,
> UnlockRelationIdForSession
> > is run inside a transaction block.
>
> I have no problem of doing so, I just dislike the way thats done in the
> loop. You can just open a new one if its required, a transaction is
> cheap, especially if it doesn't even acquire an xid.
>
OK. Doing the end of the transaction in a separate transaction and doing
the unlocking out of the transaction block...
--
Michael

Attachment	Content-Type	Size
20130306_1_remove_reltoastidxid_v4.patch	application/octet-stream	39.1 KB
20130306_2_reindex_concurrently_v18.patch	application/octet-stream	75.4 KB

In response to

Re: Support for REINDEX CONCURRENTLY at 2013-03-05 16:49:51 from Andres Freund

Responses

Re: Support for REINDEX CONCURRENTLY at 2013-03-06 08:50:39 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Haas	2013-03-06 04:23:31	Re: Request for vote to move forward with recovery.conf overhaul
Previous Message	Robert Haas	2013-03-06 04:20:17	Re: Parameterized paths vs index clauses extracted from OR clauses