Re: REINDEX backend filtering

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX backend filtering
Date: 2021-03-14 11:54:11
Message-ID: YE35Y//jLIb7gFLu@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 14, 2021 at 04:10:07PM +0800, Julien Rouhaud wrote:
> v6 attached, rebase only due to conflict with recent commit.

I have read through the patch.

+ bool outdated_filter = false;
Wouldn't it be better to rename that "outdated" instead for
consistency with the other options?

In ReindexRelationConcurrently(), there is no filtering done for the
index themselves. The operation is only done on the list of indexes
fetched from the parent relation. Why? This means that a REINDEX
(OUTDATED) INDEX would actually rebuild an index even if this index
has no out-of-date collations, like a catalog. I think that's
confusing.

Same comment for the non-concurrent case, as of the code paths of
reindex_index().

Would it be better to inform the user which indexes are getting
skipped in the verbose output if REINDEXOPT_VERBOSE is set?

+ <para>
+ Check if the specified index has any outdated dependency. For now only
+ dependency on collations are supported.
+ </para></entry>
[...]
+ <term><literal>OUTDATED</literal></term>
+ <listitem>
+ <para>
+ This option can be used to filter the list of indexes to rebuild and only
+ process indexes that have outdated dependencies. Fow now, the only
+ handle dependency is for the collation provider version.
+ </para>
Do we really need to be this specific in this part of the
documentation with collations? The last sentence of this paragraph
sounds weird. Don't you mean instead to write "the only dependency
handled currently is the collation provider version"?

+\set VERBOSITY terse \\ -- suppress machine-dependent details
+-- no suitable index should be found
+REINDEX (OUTDATED) TABLE reindex_coll;
What are those details? And wouldn't it be more stable to just check
after the relfilenode of the indexes instead?

" ORDER BY sum(ci.relpages)"
Schema qualification here, twice.

+ printf(_(" --outdated only process indexes
having outdated depencies\n"));
s/depencies/dependencies/.

+ rel = try_relation_open(indexOid, AccessShareLock);
+
+ if (rel == NULL)
+ PG_RETURN_NULL();
Let's introduce a try_index_open() here.

What's the point in having both index_has_outdated_collation() and
index_has_outdated_collation()?

It seems to me that 0001 should be split into two patches:
- One for the backend OUTDATED option.
- One for pg_index_has_outdated_dependency(), which only makes sense
in-core once reindexdb is introduced.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-03-14 11:59:08 Re: Fallback table AM for relkinds without storage
Previous Message Fabien COELHO 2021-03-14 10:17:56 Re: pgbench - add pseudo-random permutation function