Re: REINDEX CONCURRENTLY 2.0

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REINDEX CONCURRENTLY 2.0
Date: 2017-02-13 05:31:01
Message-ID: CAB7nPqRZR-Wf-Z6XfRcv3NRUpC0rGofea538XtOWNzC3R_cTFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 12, 2017 at 6:44 AM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> On 02/02/2015 03:10 PM, Andres Freund wrote:
>> I think if we should instead just use the new index, repoint the
>> dependencies onto the new oid, and then afterwards, when dropping,
>> rename the new index one onto the old one. That means the oid of the
>> index will change and some less than pretty grovelling around
>> dependencies, but it still seems preferrable to what we're discussing
>> here otherwise.
>
> I think that sounds like a good plan. The oid change does not seem like a
> too big deal to me, especially since that is what users will get now too. Do
> you still think this is the right way to solve this?

That hurts mainly system indexes. Perhaps users with broken system
indexes are not going to care about concurrency anyway. Thinking now
about it I don't see how that would not work, but I did not think
deeply about this problem lately.

> I have attached my work in progress patch which implements and is very
> heavily based on Michael's previous work. There are some things left to do
> but I think I should have a patch ready for the next commitfest if people
> still like this type of solution.

Cool to see a rebase of this patch. It's been a long time...

> I also changed index_set_state_flags() to be transactional since I wanted
> the old index to become invalid at exactly the same time as the new becomes
> valid. From reviewing the code that seems like a safe change.
>
> A couple of bike shedding questions:
>
> - Is the syntax "REINDEX <type> CONCUURENTLY <object>" ok?

Yeah, that's fine. At least that's what has been concluded in previous threads.

> - What should we do with REINDEX DATABASE CONCURRENTLY and the system
> catalog? I so not think we can reindex the system catalog concurrently
> safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
> them, reindex them while taking locks, or just error out?

System indexes cannot have their OIDs changed as they are used in
syscache lookups. So just logging a warning looks fine to me, and the
price to pay to avoid taking an exclusive lock even for a short amount
of time.

> - What level of information should be output in VERBOSE mode?

Er, something like that as well, no?
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

> What remains to be implemented:
> - Support for exclusion constraints
> - Look more into how I handle constraints (currently the temporary index too
> seems to have the PRIMARY KEY flag)
> - Support for the VERBOSE flag
> - More testing to catch bugs

This is a crasher:
create table aa (a int primary key);
reindex (verbose) schema concurrently public ;

For invalid indexes sometimes snapshots are still active (after
issuing the previous crash for example):
=# reindex (verbose) table concurrently aa;
WARNING: XX002: cannot reindex concurrently invalid index
"public.aa_pkey_cct", skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2119
WARNING: 01000: snapshot 0x7fde12003038 still active
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-02-13 05:46:30 Re: possibility to specify template database for pg_regress
Previous Message Amit Langote 2017-02-13 05:21:50 Re: Documentation improvements for partitioning