Re: Proposal: Local indexes for partitioned table

From: Maksim Milyutin <milyutinma(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Maksim Milyutin <m(dot)milyutin(at)postgrespro(dot)ru>
Subject: Re: Proposal: Local indexes for partitioned table
Date: 2017-10-07 16:15:46
Message-ID: 5d082f83-1758-df56-a6a4-47439765048f@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

07.10.17 16:34, Robert Haas wrote:

> On Fri, Oct 6, 2017 at 12:37 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> One thing I'm a bit worried about is how to name these subordinate
> indexes. They have to have names because that's how pg_class works,
> and those names can't all be the same, again because that's how
> pg_class works. There's no problem right away when you first create
> the partitioned index, because you can just pick names out of a hat
> using whatever name-generating algorithm seems best. However, when
> you dump-and-restore (including but not limited to the pg_upgrade
> case) you've got to preserve those names. If you just generate a new
> name that may or may not be the same as the old one, then it may
> collide with a user-specified name that only occurs later in the dump.
> Also, you'll have trouble if the user has applied a COMMENT or a
> SECURITY LABEL to the index because that command works by name, or if
> the user has a reference to the index name inside a function or
> whatever.
>
> These are pretty annoying corner-case bugs because they're not likely
> to come up very often. Most people won't notice or care if the index
> name changes. But I don't think it's acceptable to just ignore the
> problem. An idea I had was to treat the abstract index - to use your
> term - sort of the way we treat an extension. Normally, when you
> create an index on a partitioned table, it cascades, but for dump and
> restore purpose, we tag on some syntax that says "well, don't actually
> create the subordinate indexes, i'll tell you about those later".
> Then for each subordinate index we issue a separate CREATE INDEX
> command followed by ALTER INDEX abstract_index ATTACH PARTITION
> concrete_index or something of that sort. That means you can't
> absolutely count on the parent index to have all of the children it's
> supposed to have but maybe that's OK.

AFAICS, the main problem with naming is generating new unique names for
subordinate indexes on the stage of migrating data scheme (pg_dump,
pg_upgrade, etc). And we cannot specify these names in the 'CREATE INDEX
partitioned_index' statement therefore we have to regenerate their.

In this case I propose to restore index names' hierarchy *bottom-up*,
i.e. first of all create indexes for the leaf partitions and then create
ones for parents up to root explicitly specifying names. When creating
index on parent table we have to check is there exist any index on child
table that could be child index (identical criteria). If so, not
generate new index but implicitly attach that index into parent one.
If we have incomplete index hierarchy, e.g. we dropped some indexes of
partitions previously, then recreating of parent's index would
regenerate (not attach) indexes for those partitions. We could drop
those odd generated indexes after building of parent's index. This
decision is not straightforward but provides to consider 'CREATE INDEX
paritioned_table' statement as a cascade operation.
As a result, we can specify name for each concrete index while
recreating a whole hierarchy.

--
Regards,
Maksim Milyutin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-10-07 16:19:01 Re: Discussion on missing optimizations
Previous Message Tom Lane 2017-10-07 16:15:30 Re: Discussion on missing optimizations