Re: Proposal for DROP TABLE rollback mechanism

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Vadim Mikheev" <vmikheev(at)sectorbase(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal for DROP TABLE rollback mechanism
Date: 2000-10-28 17:52:26
Message-ID: 8092.972755546@sss.pgh.pa.us
Lists: pgsql-hackers

"Vadim Mikheev" <vmikheev(at)sectorbase(dot)com> writes:
>> 2. smgrunlink() normally will NOT immediately delete the file; instead it
>> will perform smgrclose() and then add the rel's RelFileNode information to
> ^^^^^^^^^^^^^^^
> It seems that smgrclose still needs a Relation (where rd_fd lives, which is
> bad), while you're going to pass just the file node to smgrunlink (below).
> Shouldn't the relation be removed from the relcache (and smgrclose called
> from there) before smgrunlink?

No, the way the existing higher-level code works is first to call
smgrunlink (NOT smgrclose) and then remove the relcache entry. I don't
see a need to change that. In the interval where we're waiting to
commit, there will be no relcache entry, only a RelFileNode value
sitting in smgr's list of files to delete at commit.
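
To make the bookkeeping concrete, here is a minimal sketch of such a pending-delete list. It is not the actual smgr code: the sketch_* function names, the physical_unlink stand-in, and the counter are all hypothetical, and the RelFileNode struct here is a simplified stand-in with illustrative field names. The only point is that the DROP itself just records the file node, and the file goes away (or not) when the transaction ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for RelFileNode; the field names are illustrative. */
typedef struct RelFileNode
{
    unsigned int tblNode;
    unsigned int relNode;
} RelFileNode;

/* One entry in the (hypothetical) list of files to delete at commit. */
typedef struct PendingDelete
{
    RelFileNode rnode;
    struct PendingDelete *next;
} PendingDelete;

static PendingDelete *pending_deletes = NULL;
static int files_unlinked = 0;  /* counts simulated physical deletions */

/* Stand-in for the real file removal; it only counts calls. */
static void
physical_unlink(RelFileNode rnode)
{
    (void) rnode;
    files_unlinked++;
}

/* At DROP time: remember the file node, do not touch the file. */
bool
sketch_smgrunlink(RelFileNode rnode)
{
    PendingDelete *p = malloc(sizeof *p);

    if (p == NULL)
        return false;           /* out of memory; nothing was recorded */
    p->rnode = rnode;
    p->next = pending_deletes;
    pending_deletes = p;
    return true;
}

/* Empty the list, deleting the files only if the transaction committed. */
static void
drain_pending(bool do_delete)
{
    while (pending_deletes != NULL)
    {
        PendingDelete *p = pending_deletes;

        pending_deletes = p->next;
        if (do_delete)
            physical_unlink(p->rnode);
        free(p);
    }
    assert(pending_deletes == NULL);
}

/* Commit: the drops become real. */
void
sketch_smgrcommit(void)
{
    drain_pending(true);
}

/* Abort: forget the pending drops; the files stay where they are. */
void
sketch_smgrabort(void)
{
    drain_pending(false);
}

/* Number of file nodes currently waiting for commit. */
int
sketch_pending_count(void)
{
    int n = 0;

    for (const PendingDelete *p = pending_deletes; p != NULL; p = p->next)
        n++;
    return n;
}
```

Note that nothing in the list refers to a Relation, which is why the relcache entry can be dropped right after smgrunlink without affecting the deferred delete.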

> Actually I would like to see all smgr entries taking RelFileNode
> instead of Relation.

I agree, but I think that's a project for a future release. Not enough
time for it for 7.1.

>> bufmgr.c's ReleaseRelationBuffers drops any dirty buffers for the target
>> rel, and therefore it must NOT be called inside the transaction (else,
>> rollback would mean we'd have lost data). I am inclined to let it continue
>> to behave that way, but to call it from smgrcommit/smgrabort, not from
>> anywhere else. This would mean changing its API to take a RelFileNode,
>> but that seems no big problem. This way, dirty buffers for a doomed
>> relation will be allowed to live until transaction commit, in the hopes
>> that we will be able to discard them unwritten.

> Mmmm, why not call FlushRelationBuffers? Calling bufmgr from smgr
> doesn't look like the right thing.

Yes, it's a little bit ugly, but if we call FlushRelationBuffers then we
will likely be doing some useless writes (to flush out pages that we are
only going to throw away anyway). If we leave the buffers alone till
commit, then we'll only write out pages if we need to recycle a buffer
for another use during that transaction.

Also, I don't feel comfortable with the idea of doing
FlushRelationBuffers mid-transaction and then relying on the buffer
cache to still be empty of pages for that relation much later on when
we finally commit. Sure, it *should* be empty, but I'll be happier
if we flush the buffer cache immediately before deleting the file.

What might make sense is to make a pass over the buffer cache at the
time of DROP (inside the transaction) to make sure there are no pinned
buffers for the rel --- if there are, we want to elog() during the transaction,
not after commit. We could also release any non-dirty buffers at
that point. Then after commit we know we don't care about the dirty
buffers anymore, so we come back and discard them.
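
A rough sketch of that two-pass idea, over a toy buffer pool rather than the real bufmgr data structures. Everything here is an assumption made for illustration: the SketchBuffer struct, its fields, and both function names are hypothetical, and a false return from the first pass stands in for the elog() the caller would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy buffer descriptor; not the real BufferDesc. */
typedef struct SketchBuffer
{
    bool         valid;     /* slot currently holds a page */
    unsigned int relNode;   /* which relation the page belongs to */
    int          refcount;  /* number of pins */
    bool         dirty;     /* page modified since last write */
} SketchBuffer;

/*
 * Pass 1, at DROP time, inside the transaction.
 * Returns false (and changes nothing) if any buffer of the rel is pinned,
 * so the caller can report the error while rollback is still possible.
 * Otherwise releases the clean buffers and leaves the dirty ones alone.
 */
bool
sketch_drop_time_pass(SketchBuffer *bufs, size_t n, unsigned int relNode)
{
    /* Check for pins first so that a failure has no side effects. */
    for (size_t i = 0; i < n; i++)
    {
        if (bufs[i].valid && bufs[i].relNode == relNode &&
            bufs[i].refcount > 0)
            return false;
    }

    for (size_t i = 0; i < n; i++)
    {
        if (bufs[i].valid && bufs[i].relNode == relNode && !bufs[i].dirty)
            bufs[i].valid = false;
    }
    return true;
}

/*
 * Pass 2, after commit: the rel is gone for good, so throw away whatever
 * is left for it without writing anything. Returns the number of buffers
 * discarded.
 */
size_t
sketch_post_commit_discard(SketchBuffer *bufs, size_t n, unsigned int relNode)
{
    size_t discarded = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (bufs[i].valid && bufs[i].relNode == relNode)
        {
            /* Pass 1 already verified that nobody holds a pin. */
            assert(bufs[i].refcount == 0);
            bufs[i].dirty = false;
            bufs[i].valid = false;
            discarded++;
        }
    }
    return discarded;
}
```

If the transaction aborts instead, the second pass is simply not run, and the dirty buffers are still there to be written out normally.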

regards, tom lane
