Re: REINDEX backend filtering

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX backend filtering
Date: 2020-12-15 17:34:16
Message-ID: CABUevEyqGK722MRN5kFemdChfd3gquqcx4M-zGgtHj7n6y1X3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 15, 2020 at 12:22 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Mon, Dec 14, 2020 at 3:45 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > On Thu, Dec 03, 2020 at 05:31:43PM +0800, Julien Rouhaud wrote:
> > > Now that we have the infrastructure to track indexes that might be corrupted
> > > due to changes in collation libraries, I think it would be a good idea to offer
> > > an easy way for users to reindex all indexes that might be corrupted.
> >
> > Yes. It would be a good thing.
> >
> > > The filter is also implemented so that you could cumulate multiple filters, so
> > > it could be easy to add more filtering, for instance:
> > >
> > > REINDEX (COLLATION 'libc', COLLATION 'not_current') DATABASE mydb;
> > >
> > > to only rebuild indexes depending on outdated libc collations, or
> > >
> > > REINDEX (COLLATION 'libc', VERSION 'X.Y') DATABASE mydb;
> > >
> > > to only rebuild indexes depending on a specific version of libc.
> >
> > Deciding on the grammar to use depends on the use cases we would like
> > to satisfy. From what I heard on this topic, the goal is to reduce
> > the amount of time necessary to reindex a system so as REINDEX only
> > works on indexes whose dependent collation versions are not known or
> > works on indexes in need of a collation refresh (like a reindexdb
> > --all --collation -j $jobs). What would be the benefit in having more
> > complexity with library-dependent settings while we could take care
> > of the use cases that matter the most with a simple grammar? Perhaps
> > "not_current" is not the best match as a keyword, we could just use
> > "collation" and handle that as a boolean. As long as we don't need
> > new operators in the grammar rules..
>
> I'm not sure what the DBA usual pattern here. If the reindexing
> runtime is really critical, I'm assuming that at least some people
> will dig into library details to see what are the collations that
> actually broke in the last upgrade and will want to reindex only
> those, and force the version for the rest of the indexes. And
> obviously, they probably won't wait to have multiple collation
> versions dependencies before taking care of that. In that case the
> filters that would matters would be one to only keep indexes with an
> outdated collation version, and an additional one for a specific
> collation name. Or we could have the COLLATION keyword without
> additional argument mean all outdated collations, and COLLATION
> 'collation_name' to specify a specific one. This is maybe a bit ugly,
> and would probably require a different approach for reindexdb.

Is this really a common enough operation that we need it i the main grammar?

Having the functionality, definitely, but what if it was "just" a
function instead? So you'd do something like:
SELECT 'reindex index ' || i FROM pg_blah(some, arguments, here)
\gexec

Or even a function that returns the REINDEX commands directly (taking
a parameter to turn on/off concurrency for example).

That also seems like it would be easier to make flexible, and just as
easy to plug into reindexdb?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-12-15 17:36:36 Re: SQL/JSON: functions
Previous Message Pavel Stehule 2020-12-15 17:16:47 Re: On login trigger: take three