Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, maximilian(dot)chrzan(at)here(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-09-15 16:15:26
Message-ID: 618784.1757952926@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I'll take the risk of expressing an opinion: I think we should do
> something about this problem, but I'm not all that convinced we should
> do this particular thing.

Thanks for weighing in! Opinions are good, and I'm by no means set
on doing what is in my submitted patch. However, I'm failing to
extract a clearly-better alternative from your message.

> There are two things about this that don't seem great to me. First,
> the index names for idxpart1 are no longer consistent with the index
> names for idxpart2. Second, IMHO, the names are worse. Let's talk
> about each of those problems separately. The reason the names aren't
> consistent any more is because idxpart2 is created as a standalone
> table and then attached as a partition, whereas idxpart1 is a
> partition from the first moment of its existence. It is not essential
> that the index names not vary based on which way the user does it, but
> I think it is a nicer user experience if they don't.

I think I'm not getting something here. Isn't that inconsistency
directly traceable to the user having supplied inconsistent names
to begin with? Surely we're not going to get into the business of
overruling the user's choice of names, so it seems to me that some
cases like this are inevitable. The patch does change which cases
those are, but I don't see how we avoid all such changes except by
sitting on the current rules forever. Maybe what your complaint
really points to is that this regression test case is designed around
the old naming conventions, and we ought to do more-extensive surgery
on it so that the names that are test-script-determined are consistent
with the new approach.

> Now, what about
> the absolute quality of the names? The change makes the index names
> more consistent with the name of the index on the parent, which is
> nice, but we also lose something: the index names are now less
> consistent with the names of the child tables, and they don't mention
> the affected columns any more.

Well, again, that's a user decision.

> Surely, it's not as nice for the indexes on the brassica and daucus
> tables to be named vegetables_id_idx_1 and vegetables_id_idx_2 rather
> than brassica_id_idx and daucus_id_idx. That's just gotta be worse.

I don't really agree. The only thing the overexplain test script
does to create these indexes is

CREATE INDEX ON vegetables (id);

I don't see why it's even slightly surprising that the resulting
child indexes should have names involving "vegetables" and "id".
If anything, I'd argue that the current behavior is more surprising,
even if we've grown used to it.

> The fact that you could still get it the other way if you created the
> partitions standalone and then attached them makes it even worse.

True, but I think people who didn't like that would soon adapt their
choices of index names.

> IMHO, the real problem here is that when an index is created on a
> column, we have this idea (with which I agree) that it would be nice
> to include the column name in the index, but when we have an
> expression we go "oh, rats, there's no column name, I guess we'll just
> use 'expr'", which doesn't scale very well beyond a single expression
> index.

Fair complaint, but I'm not hearing a workable proposal for
something better to do with expression indexes. This:

> One way forward could be to do some sort of hash for the plan tree and
> instead of saying "expr", say "exprXXXX" where each X is an integer or
> a hex digit or something.

would yield names that are neither intelligible nor readily
distinguishable from each other. Trying to make them stable across
system changes seems like a doomed project as well.

I do like the idea of pulling out function and variable names from
the index's expression. That won't get us all the way to unique
names, but we have to have a rule for fixing duplicate names anyway
(since you can make more than one plain index on the same column).
So we could do whatever seems sensible for deriving an
expression-index name and then deal with remaining duplications
the same way as for non-expression indexes.
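
As a rough illustration of that two-step idea (derive a label from the
expression, then resolve leftover collisions generically), here is a
minimal Python sketch. It is not PostgreSQL's implementation: the helper
names derive_expr_label and choose_index_name, the regex tokenizer, the
three-part limit, and the plain numeric suffix are all assumptions made
for the example. A real version would walk the parsed expression tree
rather than scan text.

```python
import re

# Hypothetical sketch only; none of these helpers exist in PostgreSQL.

def derive_expr_label(expression, max_parts=3):
    """Build a label from the function and column names in an expression.

    Falls back to "expr" when the expression contains no identifiers.
    """
    # Drop single-quoted string literals so their contents are not
    # mistaken for identifiers.
    stripped = re.sub(r"'[^']*'", " ", expression)
    parts = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", stripped):
        token = token.lower()
        if token not in parts:  # keep first occurrence, preserve order
            parts.append(token)
    return "_".join(parts[:max_parts]) or "expr"


def choose_index_name(table, expression, existing):
    """Pick an index name, appending a counter if the name is taken.

    `existing` is the set of names already in use. The same collision
    rule would apply to plain-column indexes.
    """
    base = f"{table}_{derive_expr_label(expression)}_idx"
    candidate = base
    counter = 0
    while candidate in existing:
        counter += 1
        candidate = f"{base}{counter}"
    return candidate


if __name__ == "__main__":
    taken = set()
    for expr in ["lower(name)", "lower(name)", "a + b", "1 + 2"]:
        name = choose_index_name("vegetables", expr, taken)
        taken.add(name)
        print(expr, "->", name)
```

The point of the sketch is only that the derivation step does not need
to guarantee uniqueness, because the collision loop handles whatever
duplicates remain.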

In any case, the issues around expression-index names feel like
an orthogonal problem to me; I don't agree that a fix for that
would remove the need to do the sorts of things I'm suggesting.

So, how do we move forward? I'm perfectly willing to look into
the derive-a-name-from-the-expression idea, but I think that'd
best be done in a separate patch.

regards, tom lane