Re: 回复:how to create index concurrently on partitioned table

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: 李杰(慎追) <adger(dot)lj(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: 回复:how to create index concurrently on partitioned table
Date: 2020-08-09 23:44:23
Message-ID: 20200809234423.GH20473@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for looking.

On Sun, Aug 09, 2020 at 02:00:09PM +0900, Michael Paquier wrote:
> > exactly what's needed.
>
> For now, I would recommend to focus first on 0001 to add support for
> partitioned tables and indexes to REINDEX. CIC is much more
> complicated btw, but I am not entering in the details now.
>
> + /* Avoid erroring out */
> if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> {
> This comment does not help, and actually this becomes incorrect as
> reindex for this relkind becomes supported once 0001 is done.

I made a minimal change to avoid forgetting to eventually change that part.

> ReindexPartitionedRel() fails to consider the concurrent case here for
> partition indexes and tables, as reindex_index()/reindex_relation()
> are the APIs used in the non-concurrent case. Once you consider the
> concurrent case correctly, we also need to be careful with partitions
> that have a temporary persistency (note that we don't allow partition
> trees to mix persistency types, all partitions have to be temporary or
> permanent).

Fixed.

> I think that you are right to make the entry point to handle
> partitioned index in ReindexIndex() and partitioned table in
> ReindexTable(), but the structure of the patch should be different:
> - The second portion of ReindexMultipleTables() should be moved into a
> separate routine, taking in input a list of relation OIDs. This needs
> to be extended a bit so as reindex_index() gets called for an index
> relkind if the relpersistence is temporary or if we have a
> non-concurrent reindex. The idea is that we finish with a single code
> path able to work on a list of relations. And your patch adds more of
> that as of ReindexPartitionedRel().

It's a good idea.

> - We should *not* handle directly partitioned index and/or table in
> ReindexRelationConcurrently() to not complicate the logic where we
> gather all the indexes of a table/matview. So I think that the list
> of partition indexes/tables to work on should be built directly in
> ReindexIndex() and ReindexTable(), and then this should call the
> second part of ReindexMultipleTables() refactored in the previous
> point.

I think I addressed these mostly as you intended.

> This way, each partition index gets done individually in its
> own transaction. For a partition table, all indexes of this partition
> are rebuilt in the same set of transactions. For the concurrent case,
> we have already reindex_concurrently_swap that it able to switch the
> dependencies of two indexes within a partition tree, so we can rely on
> that so as a failure in the middle of the operation never leaves the
> a partition structure in an inconsistent state.

--
Justin

Attachment Content-Type Size
v5-0001-Implement-REINDEX-of-partitioned-tables-indexes.patch text/x-diff 15.1 KB
v5-0002-Allow-CREATE-INDEX-CONCURRENTLY-on-partitioned-ta.patch text/x-diff 12.0 KB
v5-0003-Implement-CLUSTER-of-partitioned-table.patch text/x-diff 10.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message movead.li@highgo.ca 2020-08-10 01:52:59 Re: POC and rebased patch for CSN based snapshots
Previous Message Tom Lane 2020-08-09 16:33:43 Re: tab completion of IMPORT FOREIGN SCHEMA