Re: 回复:how to create index concurrently on partitioned table

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: 李杰(慎追) <adger(dot)lj(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: 回复:how to create index concurrently on partitioned table
Date: 2020-09-08 04:31:05
Message-ID: 20200908043105.GE19261@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 07, 2020 at 09:39:16PM -0500, Justin Pryzby wrote:
> Also, my previous revision failed to implement your suggestion to first build
> catalog entries with INVALID indexes and to then reindex them. Fixed.

- childStmt->oldCreateSubid = InvalidSubTransactionId;
- childStmt->oldFirstRelfilenodeSubid = childStmt->InvalidSubTransactionId;
+ // childStmt->oldCreateSubid = childStmt->InvalidSubTransactionId;
+ // childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId;
In the CIC patch, what is that about? It is hard to follow what you
are trying to achieve here.

+ index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+ CommandCounterIncrement();
+ index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID);
Anyway, this part of the logic is still not good here. If we fail in
the middle here, we may run into problems with a single partition
index that has only a portion of its flags set. As this gets called
for each partitioned index, it also means that we could still finish
in a state where we have only a partially-valid partition tree. For
example, imagine a partition tree with 2 levels (as reported by
pg_partition_tree), then the following happens if an index is created
concurrently from the partitioned table of level 0:
1) indexes are created in level 2 first
2) partitioned table of level 1 is switched to be ready and valid
3) indexes of level 1 are created.
4) partitioned table of level 0 is switched to be ready and valid
If the command has a failure, say between 2 and 3, we would have as
result a command that has partially succeeded in creating a partition
tree, while the user was looking for an index for the full tree. This
comes back to my previous points, where we should make
index_set_state_flags() first, and I have begun a separate thread
about that:
https://commitfest.postgresql.org/30/2725/

Second, it also means that the patch should really switch all the
indexes to be valid in one single transaction, and that we also need
more careful refactoring of DefineIndex().

I also had a quick look at the patch for CLUSTER, and it does a much
better job, still it has issues.

+ MemoryContext old_context = MemoryContextSwitchTo(cluster_context);
+
+ inhoids = find_all_inheritors(indexOid, NoLock, NULL);
+ foreach(lc, inhoids)
The area where a private memory context is used should be minimized.
In this case, you just need to hold the context while saving the
relation and clustered index information in the list of RelToCluster
items. As a whole, this case is more simple than CIC, so I'd like to
think that it would be good to work on that as next target.

Coming to my last point.. This thread has dealt since the beginning
with three different problems:
- REINDEX on partitioned relations.
- CLUSTER on partitioned relations.
- CIC on partitioned relations. (Should we also care about DROP INDEX
CONCURRENTLY as well?)

The first problem has been solved, not the two others yet. Do you
think that it could be possible to begin two new separate threads for
the remaining issues, with dedicated CF entries? We could also mark
the existing one as committed, retitled for REINDEX as a matter of
clarity. Also, please note that I am not sure if I will be able to
look more at this thread in this CF.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-09-08 04:44:17 Re: Improving connection scalability: GetSnapshotData()
Previous Message Ian Barwick 2020-09-08 04:23:01 Re: Improving connection scalability: GetSnapshotData()