Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgresql-Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Date: 2007-08-30 00:56:08
Message-ID: 46D615A8.6000600@phlo.org
Lists: pgsql-hackers

Tom Lane wrote:
> "Florian G. Pflug" <fgp(at)phlo(dot)org> writes:
>>> One comment is that at the time we make an entry into smgr's
>>> pending-deletes list, I think we might not have acquired an XID yet
>
>> Hm.. I was just going to implement this, but I'm now wondering if
>> that's really worth it.
>
> Basically what you'd give up is the ability to Assert() that there are
> no deletable files if there's no XID, which seems to me to be an
> important cross-check ... although maybe making smgr do that turns
> this "cross-check" into a tautology ... hmm. I guess the case that's
> bothering me is where we reach commit with deletable files and no XID.
> But that should probably be an error condition anyway, ie, we should
> error out and turn it into an abort. On the abort side we'd consider
> it OK to have files and no XID. Seems reasonable to me.

I've done that now, and it turned out nicely. There is an Assertion
on "(nrels == 0) || xid assigned" in the COMMIT path, but
not in the ABORT path. Seems reasonable and safe.
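
As a minimal sketch of that cross-check (not the actual patch): the type XactId, the constant INVALID_XACT_ID and all function names below are hypothetical stand-ins, and the real code would use the backend's own Assert() and TransactionId machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real PostgreSQL definitions. */
typedef uint32_t XactId;
#define INVALID_XACT_ID ((XactId) 0)

/* The invariant quoted above: (nrels == 0) || xid assigned. */
bool
commit_invariant_holds(int nrels, XactId xid)
{
    return nrels == 0 || xid != INVALID_XACT_ID;
}

/* COMMIT path: files scheduled for deletion without an XID are a bug. */
void
check_pending_deletes_at_commit(int nrels, XactId xid)
{
    assert(commit_invariant_holds(nrels, xid));
}

/* ABORT path: having files but no XID is acceptable, so nothing is checked. */
void
check_pending_deletes_at_abort(int nrels, XactId xid)
{
    (void) nrels;
    (void) xid;
}
```

The asymmetry is the point: only the commit side treats "files but no XID" as an error.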

And I'm quite tempted to not flush the XLOG at all during ABORT, and to
only force synchronous commits if one of the to-be-deleted files is
non-temporary. The last idea widens the leakage window quite a bit
though, so maybe I should resist that temptation...

OTOH, it'd allow asynchronous commits for transactions that created
temporary tables.
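
The rule being weighed here could look roughly like the sketch below. It covers only the file-deletion aspect; PendingDelete, its is_temp flag and commit_needs_sync_flush are illustrative names, not existing smgr structures, and a real commit may need a synchronous flush for other reasons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry in a pending-deletes list. */
typedef struct PendingDelete
{
    const char *path;     /* file to unlink at end of transaction */
    bool        is_temp;  /* true if the file belongs to a temporary relation */
} PendingDelete;

/*
 * Returns true if at least one to-be-deleted file is non-temporary,
 * i.e. the commit should force a synchronous XLOG flush under this rule.
 */
bool
commit_needs_sync_flush(const PendingDelete *files, size_t nfiles)
{
    assert(files != NULL || nfiles == 0);

    for (size_t i = 0; i < nfiles; i++)
    {
        if (!files[i].is_temp)
            return true;
    }
    return false;
}
```

With this rule, a transaction whose pending deletes are all temporary files returns false and could commit asynchronously.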

> The only way we could make this more robust is if we could have
> WAL-before-data rule for file *creation*, but I think that's not
> possible given that we don't know what relfilenode number we will use
> until we've successfully created a file. So there will always be
> windows where a crash leaks unreferenced files. There's been some
> debate about having crash recovery search for and delete such files, but
> so far I've resisted it on the grounds that it sounds like data loss
> waiting to happen --- someday it'll delete a file you wished it'd kept.

It seems doable, but it's not pretty. One possible scheme would be to
emit a record *after* choosing a name but *before* creating the file,
and then a second record when the file is actually created successfully.

Then, during replay we could remember a list of xids and filenames,
and remove those files for which we either haven't seen a "created
successfully" record, or no COMMIT record for the creating xid.

With this scheme, it'd be natural to force XID assignment in smgrcreate,
because we'd actually depend on logging the xid there.
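
The replay side of the scheme above can be sketched as follows. This is a toy in-memory model under stated assumptions: the record types, ReplayState and every function name are invented for illustration, the table is a fixed-size array, and no real WAL record format is implied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRACKED 64
#define MAX_NAME    64

/* Hypothetical record types: name chosen, file created, xid committed. */
typedef enum { REC_CREATE_INTENT, REC_CREATE_DONE, REC_COMMIT } RecType;

typedef struct
{
    uint32_t xid;             /* creating transaction */
    char     name[MAX_NAME];  /* file name logged before creation */
    bool     created;         /* saw the "created successfully" record */
    bool     committed;       /* saw a COMMIT record for xid */
} TrackedFile;

typedef struct
{
    TrackedFile files[MAX_TRACKED];
    int         nfiles;
} ReplayState;

void
replay_init(ReplayState *st)
{
    st->nfiles = 0;
}

/* Apply one record during replay; name is ignored for REC_COMMIT. */
void
replay_record(ReplayState *st, RecType type, uint32_t xid, const char *name)
{
    if (type == REC_CREATE_INTENT)
    {
        assert(st->nfiles < MAX_TRACKED);
        TrackedFile *f = &st->files[st->nfiles++];

        f->xid = xid;
        snprintf(f->name, sizeof f->name, "%s", name);
        f->created = false;
        f->committed = false;
        return;
    }

    for (int i = 0; i < st->nfiles; i++)
    {
        TrackedFile *f = &st->files[i];

        if (f->xid != xid)
            continue;
        if (type == REC_COMMIT)
            f->committed = true;
        else if (strcmp(f->name, name) == 0)    /* REC_CREATE_DONE */
            f->created = true;
    }
}

/* A file is removable if creation was never confirmed or its xid never committed. */
bool
replay_should_remove(const TrackedFile *f)
{
    return !f->created || !f->committed;
}

/* Collect up to max_out removable file names at end of replay; returns the count. */
int
replay_collect_orphans(const ReplayState *st, const char *out[], int max_out)
{
    int n = 0;

    for (int i = 0; i < st->nfiles && n < max_out; i++)
    {
        if (replay_should_remove(&st->files[i]))
            out[n++] = st->files[i].name;
    }
    return n;
}
```

Because every tracked entry is keyed by the creating xid, the intent record has to carry an xid, which is why forcing XID assignment at file creation falls out naturally.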

greetings, Florian Pflug
