PROPOSAL: Support global and local disabling of indexes

From: Paul Martinez <hellopfm(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: PROPOSAL: Support global and local disabling of indexes
Date: 2022-03-18 06:16:24
Message-ID: CAF+2_SEjtwnh1AEn4HvVfVNbiHRwu=H70ENey-WTgnpkGwOk1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hey, hackers,

Adding and removing indexes is a regular part of database maintenance,
but in a large database, removing an index can be a very risky operation.
Removing the wrong index could have disastrous consequences for
performance, and it could take tens of minutes, or even hours, to rebuild
the entire index.

I propose adding an ALTER INDEX command that can enable or disable an
index on a global level:

ALTER INDEX index_name ENABLE;
ALTER INDEX index_name DISABLE;

A disabled index is still updated, and still enforces constraints, but it
will not be used for queries.

Whether or not the index is disabled could also be specified at index
creation:

CREATE INDEX index_name ON table_name (col1, col2) ENABLED; -- default
CREATE INDEX index_name ON table_name (col1, col2) DISABLED;

This would be useful if a user anticipates index creation to take a long
time and they want to be able to carefully monitor the database once the
index starts getting used.

It would also be useful to be able to enable and disable indexes locally
in the context of a single session to easily and safely verify that a
query can still be executed efficiently without an index:

ALTER INDEX index_name DISABLE SESSION;

It might also be reasonable to "unset" any local override to what's
actually set on the index itself, but this would probably require slightly
different syntax: SET ENABLED = true / false / DEFAULT maybe?

I am unsure of how a user would query this information; maybe a function
like pg_disabled_index_overrides() ? The permanent state of an index
should be reflected in the output of \d <table> by appending 'DISABLED'
to disabled indexes.

The pg_index catalog entry currently includes a column `indisvalid` which
prevents queries from using the index, and this column can be set
explicitly, though not easily (it requires getting the oid of the index
relation from pg_class), and presumably not entirely safely. This column
contains significant semantic information about the state of the index, so
I don't think it makes sense to burden it with additional meaning that is
entirely user-dependent.

Supporting global enabling/disabling of an index could be accomplished
fairly simply by adding a `indisenabled` boolean flag to pg_index.
Updating this value would acquire an AccessExclusive lock on the index and
call an updated version of index_set_state_flags, which automatically
handles sending the cache invalidation message to other processes.
(Is this sufficient to also invalidate all cached query plans?)

The actual "disabling" part can be handled by adding disable_cost inside
the cost_index function in costsize.c, similar to how enable_indexscan is
handled.

Supporting session-local enabling/disabling of indexes is trickier. We can
keep track of the manual overrides in the backend process's local memory
as a very light-weight option. (A simple linked list would suffice.) But
we have to take extra care to keep this up-to-date. When an index is
dropped, any local overrides need to be dropped. It probably also makes
sense to mimic the behavior of SET SESSION, which will rollback any
changes made during a transaction if the transaction rolls back. (And if
we handle this, maybe it makes sense to support ENABLE / DISABLE LOCAL as
analogues of SET LOCAL as well.)

To handle persisting/rolling back changes we can add a new AtEOXact
function that gets called at the end of CommitTransaction and
AbortTransaction.

I'm less sure how to handle deleting entries when indexes are deleted by
other transactions (or especially by the same transaction). Could we use
CacheRegisterRelcacheCallback to be notified anytime the relcache is
updated and make sure all the indexes we have overrides for still exist?
When would that callback be executed relative to our own process? If the
backend isn't in a transaction, it would have to check for deleted indexes
right away, but if it is, we would have to wait for the end of the
transaction to update our list (possibly a job for
AtEOXact_UpdateDisabledIndexes?) Are they other parts of Postgres that
behave similarly?

A more heavy-weight option would be to actually store this info in a
catalog table, but that would add a lot of overhead to cost estimation
during query planning, so I don't think it's a great option.

Does this sound like a reasonable feature to add to Postgres? I feel like
it would make it a lot easier to manage large databases and debug some
query performance problems. There are definitely some details to iron out,
like the exact syntax, and a lot of implementation details I'm unsure of,
but if people support it I'd be glad to try to implement it.

Thanks!
Paul

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-03-18 06:33:05 Re: PROPOSAL: Support global and local disabling of indexes
Previous Message Michael Paquier 2022-03-18 06:13:05 Re: Tab completion for ALTER MATERIALIZED VIEW ... SET ACCESS METHOD