Re: Proposal for DROP TABLE rollback mechanism

From: "Vadim Mikheev" <vmikheev(at)sectorbase(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for DROP TABLE rollback mechanism
Date: 2000-10-28 17:38:22
Message-ID: 006501c04105$debc53e0$
Lists: pgsql-hackers
> 1. smgrcreate() will create the file for the relation same as it does now,
> and will add the rel's RelFileNode information to an smgr-private list of
> rels created in the current transaction.
> 2. smgrunlink() normally will NOT immediately delete the file; instead it
> will perform smgrclose() and then add the rel's RelFileNode information to
It seems that smgrclose still needs a Relation (where rd_fd lives, which
is bad), yet you're going to pass just the file node to smgrunlink (below).
Shouldn't the relation be removed from the relcache (and smgrclose called
from there) before smgrunlink?

> an smgr-private list of rels to be deleted at commit.  However, if the
> file appears in the list created by smgrcreate() --- ie, the rel is being
> created and deleted in the same xact --- then we can delete it
> immediately.  In this case we remove the file from the smgrcreate list
> and do not put it on the unlink list.

(This wouldn't work with savepoints, but it's OK for now.)

> 3. smgrcommit() will delete all the files mentioned in the list created
> by smgrunlink, then discard both lists.
> 4. smgrabort() will delete all the files mentioned in the list created
> by smgrcreate, then discard both lists.
> Points 1 and 4 will replace the existing relcache-based mechanism for
> deleting files created in the current xact when the xact aborts.
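Points 1 through 4 can be sketched as two pending lists. This is only a
minimal illustration, not the proposed smgr.c code: FileNode is a
hypothetical stand-in for RelFileNode, every sketch_* name is invented
here, and physical deletion is simulated by recording the node in a third
list instead of touching the filesystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RelFileNode; field names are assumptions. */
typedef struct { unsigned db; unsigned rel; } FileNode;

#define MAX_PENDING 64

typedef struct {
    FileNode items[MAX_PENDING];
    size_t   n;
} NodeList;

static NodeList created;  /* rels created in the current xact */
static NodeList doomed;   /* rels to unlink at commit */
static NodeList removed;  /* simulated filesystem: nodes physically deleted */

static bool same_node(FileNode a, FileNode b)
{
    return a.db == b.db && a.rel == b.rel;
}

static void list_add(NodeList *l, FileNode f)
{
    assert(l->n < MAX_PENDING);
    l->items[l->n++] = f;
}

/* Remove f from the list if present; report whether it was there. */
static bool list_take(NodeList *l, FileNode f)
{
    for (size_t i = 0; i < l->n; i++) {
        if (same_node(l->items[i], f)) {
            l->items[i] = l->items[--l->n];  /* order is not preserved */
            return true;
        }
    }
    return false;
}

static void physical_unlink(FileNode f) { list_add(&removed, f); }

/* Point 1: remember that the file was created in this xact. */
void sketch_create(FileNode f) { list_add(&created, f); }

/* Point 2: defer the delete, unless the rel was created in this xact. */
void sketch_unlink(FileNode f)
{
    if (list_take(&created, f))
        physical_unlink(f);      /* created and dropped in the same xact */
    else
        list_add(&doomed, f);    /* delete only if the xact commits */
}

static void end_xact(NodeList *to_delete)
{
    for (size_t i = 0; i < to_delete->n; i++)
        physical_unlink(to_delete->items[i]);
    created.n = 0;               /* discard both lists */
    doomed.n = 0;
}

void sketch_commit(void) { end_xact(&doomed); }   /* point 3 */
void sketch_abort(void)  { end_xact(&created); }  /* point 4 */

/* Test helper: has the node been physically deleted? */
bool sketch_was_removed(FileNode f)
{
    for (size_t i = 0; i < removed.n; i++)
        if (same_node(removed.items[i], f))
            return true;
    return false;
}
```

In this model, aborting after dropping a pre-existing rel leaves its file
alone, while aborting after a create deletes the new file.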
> Various details:
> To support deleting files at xact commit/abort, we will need something
> like an "mdblindunlink" entrypoint to md.c.  I am inclined to simply
> redefine mdunlink to take a RelFileNode instead of a complete Relation,
> rather than supporting two entrypoints --- I don't think there'll be any
> future use for the existing mdunlink.  Objections?

None. Actually, I would like to see all smgr entry points take a
RelFileNode instead of a Relation. This would require an smgr cache that
maps file nodes to fds. With that we could get rid of all the blind smgr
entry points: smgropen would put the file node and fd into the smgr cache,
and all other smgr methods would act as the blind ones do now: if the file
node is not found in the cache, open the file, perform the op, close the
file. But it's probably too late to implement this now.
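A rough sketch of that cache idea, under loud assumptions: FileNode is a
hypothetical stand-in for RelFileNode, all sketch_* names are invented for
illustration, and opening or closing a file is faked with counters so the
cached path and the "blind" path can be told apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RelFileNode; field names are assumptions. */
typedef struct { unsigned db; unsigned rel; } FileNode;

#define CACHE_SLOTS 16

typedef struct { bool used; FileNode node; int fd; } CacheEntry;

static CacheEntry cache[CACHE_SLOTS];
static int next_fd = 3;        /* fake descriptor source, not a real open() */

int sketch_open_count  = 0;    /* how many times a file was "opened" */
int sketch_close_count = 0;    /* how many times a file was "closed" */
int sketch_last_fd     = -1;   /* fd seen by the most recent operation */

static int fake_open(FileNode n) { (void) n; sketch_open_count++; return next_fd++; }
static void fake_close(int fd)   { (void) fd; sketch_close_count++; }

static CacheEntry *cache_find(FileNode n)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].used && cache[i].node.db == n.db && cache[i].node.rel == n.rel)
            return &cache[i];
    return NULL;
}

/* Explicit open: record node -> fd. Returns false if the cache is full. */
bool sketch_smgropen(FileNode n)
{
    if (cache_find(n) != NULL)
        return true;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (!cache[i].used) {
            cache[i].used = true;
            cache[i].node = n;
            cache[i].fd = fake_open(n);
            return true;
        }
    }
    return false;
}

/* Explicit close: drop the cache entry and close its fd, if any. */
void sketch_smgrclose(FileNode n)
{
    CacheEntry *e = cache_find(n);
    if (e != NULL) {
        fake_close(e->fd);
        e->used = false;
    }
}

typedef void (*FileOp)(int fd);

/* Any other operation: use the cached fd, else open, perform, close. */
void sketch_smgr_do(FileNode n, FileOp op)
{
    assert(op != NULL);
    CacheEntry *e = cache_find(n);
    if (e != NULL) {
        op(e->fd);
        return;
    }
    int fd = fake_open(n);     /* "blind" path: node not in the cache */
    op(fd);
    fake_close(fd);
}

/* Example operation: just remember which fd it was handed. */
void sketch_record_fd(int fd) { sketch_last_fd = fd; }
```

With a single entry point like sketch_smgr_do, callers never need a
Relation; whether the file is already open becomes a cache detail.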

> bufmgr.c's ReleaseRelationBuffers drops any dirty buffers for the target
> rel, and therefore it must NOT be called inside the transaction (else,
> rollback would mean we'd lost data).  I am inclined to let it continue
> to behave that way, but to call it from smgrcommit/smgrabort, not from
> anywhere else.  This would mean changing its API to take a RelFileNode,
> but that seems no big problem.  This way, dirty buffers for a doomed
> relation will be allowed to live until transaction commit, in the hopes
> that we will be able to discard them unwritten.

Mmmm, why not call FlushRelationBuffers? Calling bufmgr from smgr
doesn't look like the right thing.

> Will remove notices in DROP TABLE etc. warning that these operations
> are not rollbackable.  Note that CREATE/DROP DATABASE is still not
> rollback-able, and so those two ops will continue to elog(ERROR) when
> called in a transaction block.  Ditto for VACUUM; probably also ditto
> for REINDEX, though I haven't looked closely at that yet.
> The temp table name mapper will need to be modified so that it can
> undo all current-xact changes to its name mapping list at xact abort.
> Currently I think it only handles undoing additions, not
> deletions/renames.  This does not need to be WAL-aware, does it?
> WAL:
> AFAICS, things will behave properly if calls to smgrcreate/smgrunlink
> are logged as WAL events.  For redo, they are executed just the same

Yes, there will be logging for smgrcreate, but not for smgrunlink, because
we'll do the real unlinking after commit. All that WAL needs is the list
of file nodes to remove: I need to put this list into the commit log
record to ensure that the files are removed on redo, and I need the
ability to remove a file immediately in that case.
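One way to picture a commit record that carries the list of file nodes is
the sketch below. Everything in it is an assumption made for illustration:
the CommitRecord layout, the FileNode stand-in for RelFileNode, the
sketch_* names, and the toy in-memory "disk" used instead of real files.
It also assumes that redo should tolerate a file that is already gone, so
replaying the same record twice is harmless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RelFileNode; field names are assumptions. */
typedef struct { unsigned db; unsigned rel; } FileNode;

#define MAX_DROPPED 8

/* Hypothetical commit record: xact id plus the files to remove. */
typedef struct {
    unsigned xid;
    size_t   ndropped;
    FileNode dropped[MAX_DROPPED];
} CommitRecord;

/* Toy "disk": sketch_disk[rel] is true while that relation file exists. */
#define DISK_SIZE 32
bool sketch_disk[DISK_SIZE];

/* Remove a file right away; a missing file is not treated as an error. */
static bool remove_now(FileNode n)
{
    assert(n.rel < DISK_SIZE);
    bool existed = sketch_disk[n.rel];
    sketch_disk[n.rel] = false;
    return existed;
}

/* Redo of a commit record: remove every listed file immediately.
 * Returns how many files were actually still present. */
size_t sketch_redo_commit(const CommitRecord *rec)
{
    size_t removed = 0;

    assert(rec->ndropped <= MAX_DROPPED);
    for (size_t i = 0; i < rec->ndropped; i++)
        if (remove_now(rec->dropped[i]))
            removed++;
    return removed;
}
```

Replaying the same record a second time finds nothing left to delete and
simply reports zero removals.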

> as normal, except they shouldn't complain if the target file already
> exists (or already doesn't exist, for unlink).  Undo of smgrcreate
> is just immediate mdunlink; undo of smgrunlink is a no-op.
> I have not studied the WAL code enough to be prepared to add the
> logging/undo/redo code, and it looks like you haven't implemented that
> anyway yet for smgr.c, so I will leave that part to you, OK?

Unfortunately, there will be no undo in 7.1 -:(
I've found that to undo index updates we would need either compensation
records or xmin/cmin in index tuples. So we'll still live with dust in
storage -:(
Redo is much easier.

