Re: [HACKERS] DROP TABLE inside a transaction block

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
Cc: Lamar Owen <lamar(dot)owen(at)wgcr(dot)org>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Mike Mascari <mascarm(at)mascari(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] DROP TABLE inside a transaction block
Date: 2000-03-08 06:54:52
Message-ID: 24306.952498492@sss.pgh.pa.us
Lists: pgsql-hackers

Philip Warner <pjw(at)rhyme(dot)com(dot)au> writes:
> For the ignorant, are you able to explain why naming files
> '<table_name>_<OID>' is not acceptable? This seems to satisfy both
> requirements (and seemed to be the conclusion of the previous discussion).

Well, it's pretty simple: consider what has to happen to make RENAME
TABLE be rollback-able.

You clearly have to update the pg_class tuple whose relname field
contains the table name. That's no problem, because the normal
tuple commit mechanics will take care of making that tuple update
visible or not.

But, in the current implementation, renaming a table also requires
renaming the physical files that hold the table's data --- and last
I checked, Unix filesystems don't know anything about Postgres
transactions. Our current code renames the files instantly when
the table rename command is done, and there isn't any code for
undoing that rename. Thus, aborting the xact afterwards fails, because
the pg_class entries revert to their pre-xact values, but the physical
files don't revert to their prior names.
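
To make the failure concrete, here is a toy sketch, not the actual backend code. It assumes a hypothetical ToyCatalog class standing in for pg_class (a mapping from OID to relname that can be rolled back), a temporary directory standing in for the data directory, and a made-up OID of 16384. The catalog change is undone on abort, but the os.rename() is not:

```python
import os
import tempfile


class ToyCatalog:
    """Hypothetical stand-in for pg_class: maps OID -> relname, with rollback."""

    def __init__(self):
        self.committed = {}
        self.pending = None

    def begin(self):
        # Work on a private copy; nothing is visible until commit().
        self.pending = dict(self.committed)

    def set_relname(self, oid, relname):
        self.pending[oid] = relname

    def commit(self):
        self.committed = self.pending
        self.pending = None

    def abort(self):
        # Discard the private copy; committed state is untouched.
        self.pending = None


def rename_table_name_based(catalog, datadir, oid, new_name):
    """Rename a table whose data file is named after the table itself."""
    old_name = catalog.pending[oid]
    catalog.set_relname(oid, new_name)  # transactional
    os.rename(os.path.join(datadir, old_name),
              os.path.join(datadir, new_name))  # immediate, not transactional


def demo_abort_after_rename():
    with tempfile.TemporaryDirectory() as datadir:
        catalog = ToyCatalog()

        # Create table "foo" and commit.
        catalog.begin()
        catalog.set_relname(16384, "foo")
        open(os.path.join(datadir, "foo"), "w").close()
        catalog.commit()

        # Rename it inside a transaction, then abort.
        catalog.begin()
        rename_table_name_based(catalog, datadir, 16384, "bar")
        catalog.abort()

        return catalog.committed[16384], sorted(os.listdir(datadir))
```

After the abort, the catalog says the table is "foo" again, but the only file on disk is "bar", so the table's data can no longer be found under the name the catalog expects.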

If we change the implementation so that the files are named after
the (fixed, never-changed-after-creation) table OID, then RENAME
TABLE is no problem: it affects *nothing* except the relname field
of the table's pg_class row, and either that row update is committed
or it ain't.
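
The same toy setup with OID-named files might look like the sketch below. Again this is an illustration under assumptions, not the real storage layout: the catalog is a plain dict from OID to relname, the OID 16384 is made up, and the file is simply named after the OID. The rename touches only the pending catalog copy, so commit and abort both leave the file where it was:

```python
import os
import tempfile


def rename_table_oid_based(pending, oid, new_name):
    """Rename a table whose data file is named after its OID.

    Only the (toy) catalog entry changes; the filesystem is never touched.
    """
    pending[oid] = new_name


def demo_oid_named_rename(commit):
    with tempfile.TemporaryDirectory() as datadir:
        oid = 16384                      # hypothetical OID
        committed = {oid: "foo"}         # toy pg_class: OID -> relname
        open(os.path.join(datadir, str(oid)), "w").close()

        # Start a transaction on a private copy of the catalog.
        pending = dict(committed)
        rename_table_oid_based(pending, oid, "bar")

        if commit:
            committed = pending          # the row update becomes visible
        # On abort, the pending copy is simply dropped.

        return committed[oid], sorted(os.listdir(datadir))
```

Either way the file is still named "16384"; only the relname differs depending on whether the transaction committed.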

But if the physical file names contain the logical table name, we
have to be prepared to rename those files in sync with the transaction
commit that makes the pg_class update valid. Quite aside from any
implementation effort involved, the critical point is this: it is
*not possible* to ensure that that collection of changes is atomic.
At best, we can make the window for failure small.
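
The window can be illustrated with another toy sketch. The assumptions here are mine: a dict stands in for the committed catalog, SimulatedCrash is a hypothetical exception standing in for a backend or machine crash, and the "commit" is just a dict assignment. Whichever order the two steps are done in, a crash between them leaves the catalog and the filesystem disagreeing:

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Hypothetical stand-in for a crash between the two steps."""


def commit_with_rename(catalog, datadir, oid, new_name,
                       rename_first, crash_between):
    old_name = catalog[oid]

    def do_rename():
        os.rename(os.path.join(datadir, old_name),
                  os.path.join(datadir, new_name))

    def do_commit():
        catalog[oid] = new_name

    steps = [do_rename, do_commit] if rename_first else [do_commit, do_rename]
    steps[0]()
    if crash_between:
        raise SimulatedCrash()           # the second step never happens
    steps[1]()


def is_consistent(catalog, datadir, oid):
    """True if a file exists under the name the catalog records."""
    return os.path.exists(os.path.join(datadir, catalog[oid]))


def demo_window(rename_first, crash_between):
    with tempfile.TemporaryDirectory() as datadir:
        catalog = {16384: "foo"}         # hypothetical OID and table name
        open(os.path.join(datadir, "foo"), "w").close()
        try:
            commit_with_rename(catalog, datadir, 16384, "bar",
                               rename_first, crash_between)
        except SimulatedCrash:
            pass
        return is_consistent(catalog, datadir, 16384)
```

Without a crash both orderings end up consistent; with a crash in the middle, neither does. Reordering the steps only changes which side is stale, it does not close the window.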

Bruce seems to be willing to accept a window of failure for RENAME
TABLE in order to make database admin easier. That is very possibly
the right tradeoff --- but it is *not* an open-and-shut decision.
We need to talk about it.

regards, tom lane
