Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Hannu Krosing <hannuk(at)google(dot)com>, Sergey Sargsyan <sergey(dot)sargsyan(dot)2001(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: 2026-04-13 01:05:48
Message-ID: CAK3UJREa7qT_=QUx7Fc9D=Z2V-BAti8EXFR=9kB7yfN-bao32Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 7, 2026 at 7:20 PM Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
wrote:

> Hello, Josh!
>
> Your review looks a bit LLM-generated, but anyway - thanks for review! :)
> Especially because at least one point seems to be valid.
>
> > We're leaving behind two invalid indexes now that the user has to figure
> > out how to drop in case of an error - that seems like it could be
> confusing for the user.
> > Could we have some better way (error handler, background worker) try to
> perform this cleanup automatically?
> > If not, we should at least tell the user clearly in the error message
> that both
> > invalid indexes are left behind (i.e. "idx" and "idx_ccaux" in the
> example)
>
> Commit 0005 adds automatic dropping of auxiliary indexes when the
> original index is reindexed or dropped. Also, documentation reflects
> the ccaux index (similar to ccnew).
>

Well, we auto-drop the aux index if the user figures out that they should
drop the main index first.

> > Docs are inconsistent or confusing about whether there's one or two
> indexes left behind in case of error
> > - e.g. "command will fail but leave behind *an* invalid index and its
> associated auxiliary index"
> > somewhat implying there is only one invalid index, and somehow the
> auxiliary index is valid?
>
> Auxiliary index is never marked as valid; I'm not sure we need to
> highlight it here. Or do you have an idea how to rephrase?
>

For example in this doc change hunk:

@@ -664,12 +665,19 @@ postgres=# \d tab
col | integer | | |
Indexes:
"idx" btree (col) INVALID
+ "idx_ccaux" stir (col) INVALID
</programlisting>

- The recommended recovery
- method in such cases is to drop the index and try again to perform
- <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is
- to rebuild the index with <command>REINDEX INDEX
CONCURRENTLY</command>).
+ The recommended recovery method in such cases is to drop the index with
+ <command>DROP INDEX</command>. The auxiliary index (suffixed with
+ <literal>_ccaux</literal>) will be automatically dropped when the main
+ index is dropped. After dropping the indexes, you can try again to
perform
+ <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is
to
+ rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>,
+ which will also handle cleanup of any invalid auxiliary indexes.)
+ If the only invalid index is one suffixed <literal>_ccaux</literal>,
+ the recommended recovery method is just <literal>DROP INDEX</literal>
+ for that index.
</para>

The output we're showing the user from psql is two INVALID indexes, and
we're keeping the original doc suggestion on the first line that "The
recommended recovery method in such cases is to drop the index with DROP
INDEX". The next sentence clarifies a bit that there's an auxiliary index
that "will be automatically dropped". But now it's on the user to figure
out which index is which, and drop the right one.

> > Similarly, when the doc mentions e.g. "drop the index" - it's not
> necessarily clear which index
> > we're talking about when there are two invalid indexes left behind that
> the user will see with `\d`
>
> In one commit it says: "method in such cases is to drop these indexes
> and try again to perform".
> After 0005 "The auxiliary index (suffixed with
> <literal>_ccaux</literal>) will be automatically dropped when the main
> index is dropped".
> It seems clear to me, but feel free to provide your variant.
>
> > * It would be nice to guard against users trying arbitrary CREATE INDEX
> ... USING stir(...) with a clear error
>
> It will fail with "Building STIR indexes is not supported".
>

Sorry, you are right, this is handled with a good error.

>
> > One of the testcases (line 2478 of patch 0004) does `DELETE FROM
> concur_reindex_tab4 WHERE c1 = 1;`
> > but the table `concur_reindex_tab4` looks like it has been dropped a few
> lines above that?
>
> Hm, yep, I'll fix it.
>
> > The StirPageGetFreeSpace macro from patch 0002 reads
> `StirPageGetMaxOffset(page)`
> > which seems like it could cause an unsafe read of opaque->maxoff if used
> on the metapage
>
> But it was never used for the metapage.
>

Yes, but I think it'd be better not to leave a possible foot-gun around for
the next developer. Even just adding an AssertMacro like:

#define StirPageGetMaxOffset(page) (AssertMacro(!StirPageIsMeta(page)),
StirPageGetOpaque(page)->maxoff)

There are some other asserts for some of the trickier bookkeeping that
happens in this patch that I think would help check the code, and make it
easier to understand as well. E.g. adding an assertion check at the end of
StirPageAddItem(), and inside stirbulkdelete() (I tried calling it around
L499 of stir.c, just before 'while (itup < itupEnd)').

/*
* Validate that maxoff and pd_lower are consistent on a STIR data page.
*
* On a freshly initialized empty page, pd_lower is SizeOfPageHeaderData
* (set by PageInit). After the first insert, pd_lower is computed from
* PageGetContents which uses MAXALIGN(SizeOfPageHeaderData).
*/
static inline void
StirPageValidate(Page page)
{
Assert(!StirPageIsMeta(page));
Assert(StirPageGetOpaque(page)->maxoff == 0
? ((PageHeader) page)->pd_lower == SizeOfPageHeaderData
: ((PageHeader) page)->pd_lower ==
MAXALIGN(SizeOfPageHeaderData) +
StirPageGetOpaque(page)->maxoff * sizeof(StirTuple));
}

>
> > A comment explains "No predicate evaluation is needed here" , i.e. we
> are skipping predicate
> > evaluation in the validation scan step, assuming that the
> > auxiliary index contains only qualifying TIDs. Is this really
> bulletproof for e.g. partial indexes which may
> > no longer satisfy the predicate at the time of the validation scan due
> to conflicting HOT updates?
>
> Conflicting HOT updates are not possible because the catalog contains
> the new index definition from the start of the process.
> Or do you mean a different scenario?
>

Sorry for the false alarm, I believe you are right - I had to double
check RelationGetIndexAttrBitmap(), but I believe this is safe based on
the hotblockingattrs bitmap.

Overall, this is a nice improvement for CIC/RC that I think should help
particularly on large, busy systems.

Thanks,
Josh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2026-04-13 01:18:11 Re: StringInfo fixes, v19 edition. Plus a few oddities
Previous Message Peter Smith 2026-04-13 01:05:38 Re: Add missing period to DETAIL messages