Re: 回复:how to create index concurrently on partitioned table

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, 李杰(慎追) <adger(dot)lj(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: 回复:how to create index concurrently on partitioned table
Date: 2020-09-04 00:51:13
Message-ID: 20200904005113.GC19499@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 03, 2020 at 10:02:58PM +0300, Anastasia Lubennikova wrote:
> First of all, this patch fails at cfbot:
>
> indexcmds.c:2848:7: error: variable ‘parentoid’ set but not used
> [-Werror=unused-but-set-variable]
> Oid parentoid;^

Missed this warning, thanks for pointing it out. This is an incorrect
rebase: this variable was used as a workaround to take a session lock
on the parent table to be reindexed, session lock logic existing to
prevent cache lookup errors when dropping some portions of the tree
concurrently (1d65416 as you already know). But we don't need that
anymore.

> If this guessed fix is correct, I see the problem in the patch logic. In
> reindex_partitions() we collect parent relations to pass them to
> reindex_multiple_internal(). It implicitly changes the logic from REINDEX
> INDEX to REINDEX RELATION, which is not the same, if table has more than one
> index.

Incorrect guess here. parentoid refers to the parent table of an
index, so by saving its OID in the list of things to reindex, you
would finish by reindexing all the indexes of a partition. We need to
use partoid, as returned by find_all_inheritors() for all the members
of the partition tree (relid can be a partitioned index or partitioned
table in reindex_partitions).

> 1) in create_index.sql
>
> Are these two lines intentional checks that must fail? If so, I propose to
> add a comment.
>
> REINDEX TABLE concur_reindex_part_index;
> REINDEX TABLE CONCURRENTLY concur_reindex_part_index;
>
> A few lines around also look like they were copy-pasted and need a second
> look.

You can note some slight differences though. These are test cases for
REINDEX INDEX with tables, and REINDEX TABLE with indexes. What you
are quoting here is the part for indexes with REINDEX TABLE. And
there are already comments, see:
"-- REINDEX INDEX fails for partitioned tables"
"-- REINDEX TABLE fails for partitioned indexes"

Perhaps that's not enough, so I have added some more comments to
outline that these commands fail (8 commands in total).

> 2)  This part of ReindexRelationConcurrently() is misleading.
>
> Maybe assertion is enough. It seems, that we should never reach this code
> because both ReindexTable and ReindexIndex handle those relkinds
> separately.  Which leads me to the next question.

Yes, we could use an assert, but I did not feel any strong need to
change that either for this patch.

> 3) Is there a reason, why reindex_partitions() is not inside
> ReindexRelationConcurrently() ? I think that logically it belongs there.

Yes, it simplifies the build of the index list indexIds, as there is
no need to loop back into a different routine if working on a table or
a matview.

> 4) I haven't tested multi-level partitions yet. In any case, it would be
> good to document this behavior explicitly.

Not sure what addition we could do here. The patch states that each
partition of the partitioned relation defined gets reindexed, which
implies that this handles multiple layers automatically.

> I need a bit more time to review this patch more thoroughly. Please, wait
> for it, before committing.

Glad to hear that, please take the time you need.
--
Michael

Attachment Content-Type Size
reindex-part-v9.patch text/x-diff 40.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2020-09-04 00:53:43 Re: Disk-based hash aggregate's cost model
Previous Message Tom Lane 2020-09-04 00:15:32 Re: proposal: possibility to read dumped table's name from file