Re: Hot Standby and VACUUM FULL

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hot Standby and VACUUM FULL
Date: 2010-02-01 15:06:17
Message-ID: 19652.1265036777@sss.pgh.pa.us
Lists: pgsql-hackers
Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> Once the updated map file is moved into place, the relocation is effectively
>> committed even if we subsequently abort the transaction.  We can make that
>> window pretty narrow but not remove it completely.

> We could include the instructions to update the map file in the commit
> record, instead of introducing a new record type, and update the map
> file only *after* writing the commit record. The map file doesn't grow,
> so we can be pretty confident that updating it doesn't fail (failure
> would lead to PANIC).

> I'm assuming the map file is fixed size, with a fixed location for each
> relation, so that we can just overwrite the old file without the
> create+rename dance, and not worry about torn-pages.

That seems too fragile to me, as I don't find it a stretch at all to
think that writing the map file might fail --- just think Windows
antivirus code :-(.  Now, once we have written the WAL record for
the mapfile change, we can't really afford a failure in my approach
either.  But I think a rename() after successfully creating/writing/
fsync'ing a temp file is a whole lot safer than writing from a standing
start.
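The create/write/fsync/rename sequence mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not PostgreSQL code: the function name replace_file_atomically and the ".tmp" suffix are hypothetical, and a complete implementation would probably also fsync the containing directory so that the rename itself is durable.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace `path` with `len` bytes from `data` by
 * writing a temp file next to it and rename()ing it into place.
 * Returns 0 on success, -1 on failure; on failure the old file is untouched.
 */
int replace_file_atomically(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    int n, fd;

    /* Temp file name is an assumption: "<path>.tmp" in the same directory. */
    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t) n >= sizeof tmp)
        return -1;

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Write the new contents and force them to disk before the rename. */
    if (write(fd, data, len) != (ssize_t) len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* rename() swaps the new file in; readers see either old or new. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Every step that can plausibly fail (create, write, fsync) happens before the old file is touched, which is the safety argument made above; only the final rename() changes what readers see.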

The other problem with what you sketch is that it'd require holding the
mapfile write lock across commit, because we still have to have strict
serialization of updates.

[ thinks for a while ... ]  OTOH, overwrite-in-place is what we've always
used for pg_control updates, and I don't recall ever seeing a report of
a problem that could be traced to that.  Maybe we should forget the
rename() trick and overwrite the map file in place.  I still think it
needs to be a separate WAL record though.  I'm thinking

	* obtain lock
	* open file for read/write
	* read current contents
	* construct modified contents
	* write and sync WAL record
	* write back file through already-opened descriptor
	* fsync
	* release lock

Not totally clear if this is more or less safe than the rename method;
but given the assumption that the file is less than one disk block,
it should be just as atomic as pg_control updates are.
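
The overwrite-in-place steps listed above could look roughly like this in plain POSIX C. It is a sketch under several assumptions, none of which come from the original message: the file is a fixed MAP_FILE_SIZE of 512 bytes, log_map_change is a hypothetical stub standing in for "write and sync WAL record", and the lock is an fcntl() advisory lock on the file itself, so the file is opened before the lock is taken (the reverse of the order in the list). A real server would use its own locking and WAL routines.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAP_FILE_SIZE 512   /* assumed: no larger than one disk block */

/* Hypothetical stand-in for "write and sync WAL record"; always succeeds. */
static int log_map_change(const char *image, size_t len)
{
    (void) image;
    (void) len;
    return 0;
}

/* Set (F_WRLCK) or clear (F_UNLCK) an advisory lock on the whole file. */
static int set_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);      /* l_start = 0, l_len = 0: whole file */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Overwrite `len` bytes at offset `off` of the fixed-size map file.
 * Returns 0 on success, -1 if nothing was changed, and -2 if the change
 * was logged but the write-back failed (where a server might PANIC).
 */
int update_map_in_place(const char *path, size_t off,
                        const void *val, size_t len)
{
    char image[MAP_FILE_SIZE];
    int rc = -1;
    int fd;

    if (off > MAP_FILE_SIZE || len > MAP_FILE_SIZE - off)
        return -1;

    fd = open(path, O_RDWR);                    /* open file for read/write */
    if (fd < 0)
        return -1;
    if (set_lock(fd, F_WRLCK) != 0) {           /* obtain lock */
        close(fd);
        return -1;
    }

    /* read current contents */
    if (pread(fd, image, sizeof image, 0) == (ssize_t) sizeof image) {
        memcpy(image + off, val, len);          /* construct modified contents */
        if (log_map_change(image, sizeof image) == 0) {   /* log the change */
            /* write back through the already-opened descriptor, then fsync */
            if (pwrite(fd, image, sizeof image, 0) == (ssize_t) sizeof image
                && fsync(fd) == 0)
                rc = 0;
            else
                rc = -2;
        }
    }

    set_lock(fd, F_UNLCK);                      /* release lock */
    close(fd);
    return rc;
}
```

The whole image is written back with a single pwrite() call, which is what the one-disk-block assumption relies on for atomicity.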

			regards, tom lane