I don't want to back up index files

From: Glen Parker <glenebob(at)nwlink(dot)com>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: I don't want to back up index files
Date: 2009-03-11 01:54:30
Message-ID: 49B719D6.70209@nwlink.com
Lists: pgsql-general

I am wondering about the feasibility of having PG continue to work even if
non-essential indexes are gone or corrupt. I brought this basic concept
up at some point in the past, but now I have a different motivation, so
I want to strike up discussion about it again. This time around, I
simply don't want to back up indexes if I don't have to. Because
indexes contain essentially redundant data, losing one does not equate
to losing real data. Therefore, backing them up represents a lot of
overhead for very little benefit.

Here's the basic idea:

1) New field to pg_index (indvalid boolean).
2) Query planner skips indexes where indvalid = false.
3) Executor does not update indexes where indvalid = false.
4) Executor refuses insert or update to unique columns where indvalid =
false, throwing an error.
5) WAL roll forward marks indvalid = false if index file(s) are missing,
rather than panicking.
6) REINDEX recognizes syntax to only build indexes with indvalid =
false, marks indvalid = true.

Close to 25% of the on-disk bulk of my database is index files. Not
having to archive the index files would save a significant amount of
the system resources used during the backup. In the unlikely
event that a restore/roll forward becomes necessary, I could simply
issue something like "REINDEX DATABASE foo INVALID;" to restore all the
missing indexes and return the database to full function. Prior to a
reindex, the database would perform poorly and refuse to do certain
inserts and updates, but the data would be available. Backup files
would be smaller, and the restore/roll forward would be faster.
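
The "INVALID" keyword above is proposed syntax only. As a rough sketch of the same recovery step using the existing per-index form "REINDEX INDEX name", the following Python helper builds one statement per missing index. It assumes the caller already has the list of invalid indexes as (schema, index) pairs; the function names and the example index names are hypothetical.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def reindex_statements(invalid_indexes):
    """Build one REINDEX INDEX statement per (schema, index) pair.

    The list of invalid indexes is assumed to be supplied by the caller;
    this sketch does not look anything up in the catalogs.
    """
    return [
        f"REINDEX INDEX {quote_ident(schema)}.{quote_ident(index)};"
        for schema, index in invalid_indexes
    ]


if __name__ == "__main__":
    # Hypothetical index names, for illustration only.
    for statement in reindex_statements([("public", "orders_pkey"),
                                         ("public", "orders_date_idx")]):
        print(statement)
```

Emitting one statement per index also makes it possible to rebuild the unique indexes first, so that inserts and updates are accepted again as early as possible.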

No downsides jump out at me, and it seems to me that for a regular PG
code hacker this could actually be fairly simple to implement.

Any chance of something like this being done in the future?

-Glen
