Re: Progress report of CREATE INDEX for nested partitioned tables

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ilya Gladyshev <ilya(dot)v(dot)gladyshev(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Progress report of CREATE INDEX for nested partitioned tables
Date: 2023-03-25 02:53:19
Message-ID: ZB5iH14191E0LbBb@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 23, 2023 at 04:35:46PM -0400, Tom Lane wrote:
> So I'm still pretty desperately unhappy with count_leaf_partitions.
> I don't like expending significant cost purely for progress tracking
> purposes, I don't like the undocumented assumption that NoLock is
> safe, and what's more, if it is safe then we've already traversed
> the inheritance tree to lock everything so in principle we could
> have the count already. However, it does seem like getting that
> knowledge from point A to point B would be a mess in most places.
>
> One thing we could do to reduce the cost (and improve the safety)
> is to forget the idea of checking the relkinds and just set the
> PARTITIONS_TOTAL count to list_length() of the find_all_inheritors
> result.

Actually list_length() minus 1 ...

> We've already agreed that it's okay if the PARTITIONS_DONE
> count never reaches PARTITIONS_TOTAL, so this would just be taking
> that idea further. (Or we could increment PARTITIONS_DONE for
> non-leaf partitions when we visit them, thus making that TOTAL
> more nearly correct.)

Yes, I think that's actually more correct. If TOTAL is set without
regard to relkind, then DONE ought to be set the same way.

I updated the documentation to indicate that the counters include the
intermediate partitioned rels, but I wonder if it's better to say
nothing and leave that undefined.

> Furthermore, as things stand it's not hard
> for PARTITIONS_TOTAL to be zero --- there's at least one such case
> in the regression tests --- and that seems just weird to me.

I don't know why it'd seem weird. postgres doesn't create partitions
automatically, so by default there are none. If we create a table but
never load any data, it'll have no partitions. Also, the TOTAL=0 case
won't go away just because we start counting intermediate partitions.

> By the by, this is awful code:
>
> + if (RELKIND_HAS_STORAGE(get_rel_relkind(partrelid)))
>
> Consult the definition of RELKIND_HAS_STORAGE to see why.
> But I want to get rid of that rather than fixing it.

Good point, but I'd burden-shift the blame to RELKIND_HAS_STORAGE().

BTW, I promoted myself to a co-author of the patch. My interest here is
to resolve this hoping to allow the CIC patch to progress.

--
Justin

Attachment Content-Type Size
0001-fix-CREATE-INDEX-progress-report-with-nested-partiti.patch text/x-diff 11.9 KB
0002-assertions-for-progress-reporting.patch text/x-diff 5.1 KB
0003-f-also-assert-that-progress-values-don-t-go-backward.patch text/x-diff 7.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-03-25 02:56:22 Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
Previous Message Michael Paquier 2023-03-25 02:50:50 Re: Generate pg_stat_get_xact*() functions with Macros