Allow replacement of bloated primary key indexes without foreign key rebuilds

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Allow replacement of bloated primary key indexes without foreign key rebuilds
Date: 2012-07-07 03:53:12
Message-ID: CABwTF4UxTg+kERo1Nd4dt+H2miJoLPcASMFecS1-XHijABOpPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT ...
USING INDEX we added back in the day is not so useful in the field. Having
to drop foreign key constraints before this command, and recreate them
afterwards makes this command useless to most database setups. I feel sorry
that no one brought this up when we were implementing the feature; maybe we
could've done something about it right then.

I wish to correct it now, and did some research. Long story short, I
realized that the foreign key constraint depends on the index relation of
the primary key, and does not depend on the constraint object as I had
expected (Please see rows 5 and 11 of the result set shown below). This
behaviour is also seen when the FKey references a unique constraint. As
much as that perplexes me, I think it makes our job a bit easier.

All we need to do is allow swapping of pg_class.relfilenode of two indexes.
This will let the dependency entries stand as they are and allow us to drop
the bloated primary key index structure without having to rebuild the
foreign key constraints.

As for the syntactical sugar, this can be added to either ALTER TABLE or to
ALTER INDEX. Although under no normal circumstances one would need to use
ALTER INDEX to swap two indexes' relfilenode (because one can easily create
a duplicate index and drop/rename-in-place the old one), I think it would
make more sense here since it is just an operation on two indexes and has
nothing to do with the constraints, apart from the fact that we want to use
this feature to meddle with the constraints.

Syntax options:

ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING
INDEX new_index;

ALTER INDEX ind REPLACE WITH new_index;

Note that in both the syntaxes, it is assumed that all remnants of
new_index will be gone after the command completes successfully; that is,
the commands will behave as if they deleted the index structure of the
index being replaced and placed the new structure in its place, while
dropping the index that was used for the replacement.

I don't think we need to ensure that the new_index is completely flushed to
disk before the operation, but we do need to issue relevant cache
invalidation messages after the operation is done.

For replacement to be successful, new_index should not be associated with
any constraints, and, new_index should be identical to the index being
replaced, except for the index names.

The ALTER TABLE syntax closely emulates the existing syntax of replacing a
constraint using an existing index, but looking at the grammar construction
I feel that it may be more complex to implement than the ALTER INDEX syntax.

ALTER INDEX feels easier to do, since we won't have to jump through hoops
like in ALTER TABLE's multi-command support (ATExec*() functions), and
dropping the new_index might be easier to do.

Thoughts?

postgres=# \d+ test
Table "public.test"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
a | integer | not null | plain | |
b | integer | | plain | |
Indexes:
"test_pkey" PRIMARY KEY, btree (a)
"temp_idx" UNIQUE, btree (a)
Referenced by:
TABLE "test2" CONSTRAINT "test2_b_fkey" FOREIGN KEY (b) REFERENCES
test(a)
Has OIDs: no

postgres=# \d+ test2
Table "public.test2"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
a | integer | | plain | |
b | integer | | plain | |
Foreign-key constraints:
"test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)
Has OIDs: no

Relevant output of query [4] on pg_depend:

classid | objid | objid | refclassid |
refobjid | refobjid | deptype
---------------+-------+------------------------------+---------------+----------+--------------+---------
pg_class | 16413 | test | pg_namespace |
2200 | public | n
pg_type | 16415 | test | pg_class |
16413 | test | i
pg_type | 16414 | test[] | pg_type |
16415 | test | i
pg_constraint | 16417 | test_pkey | pg_class |
16413 | test | a
pg_class | 16416 | test_pkey | pg_constraint |
16417 | test_pkey | i
pg_class | 16418 | test2 | pg_namespace |
2200 | public | n
pg_type | 16420 | test2 | pg_class |
16418 | test2 | i
pg_type | 16419 | test2[] | pg_type |
16420 | test2 | i
pg_constraint | 16421 | test2_b_fkey | pg_class |
16413 | test | n
pg_constraint | 16421 | test2_b_fkey | pg_class |
16418 | test2 | a
pg_constraint | 16421 | test2_b_fkey | pg_class |
16416 | test_pkey | n
pg_trigger | 16422 | RI_ConstraintTrigger_a_16422 | pg_constraint |
16421 | test2_b_fkey | i
pg_trigger | 16423 | RI_ConstraintTrigger_a_16423 | pg_constraint |
16421 | test2_b_fkey | i
pg_trigger | 16424 | RI_ConstraintTrigger_c_16424 | pg_constraint |
16421 | test2_b_fkey | i
pg_trigger | 16425 | RI_ConstraintTrigger_c_16425 | pg_constraint |
16421 | test2_b_fkey | i

[1] http://archives.postgresql.org/pgsql-general/2012-07/msg00104.php
[2] http://archives.postgresql.org/pgsql-general/2012-07/msg00105.php
[3] http://archives.postgresql.org/pgsql-general/2012-07/msg00110.php
[4] select classid::regclass, objid, case classid::regclass::text when
'pg_class' then objid::regclass::text when 'pg_type' then
objid::regtype::text when 'pg_constraint' then (select conname from
pg_constraint where oid = objid) when 'pg_namespace' then (select nspname
from pg_namespace where oid = objid) when 'pg_trigger' then (select tgname
from pg_trigger where oid = objid) else objid::text end,
refclassid::regclass, refobjid, case refclassid::regclass::text when
'pg_class' then refobjid::regclass::text when 'pg_type' then
refobjid::regtype::text when 'pg_constraint' then (select conname from
pg_constraint where oid = refobjid) when 'pg_namespace' then (select
nspname from pg_namespace where oid = refobjid) when 'pg_trigger' then
(select tgname from pg_trigger where oid = refobjid) else refobjid::text
end, deptype from pg_depend;

--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2012-07-07 09:36:41 Re: Bug tracker tool we need
Previous Message Robert Haas 2012-07-07 02:13:24 Re: Bug tracker tool we need