Skip site navigation (1) Skip section navigation (2)

Re: Hot Standby and VACUUM FULL

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hot Standby and VACUUM FULL
Date: 2010-02-01 15:45:12
Message-ID: 20417.1265039112@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> The other problem with what you sketch is that it'd require holding the
>> mapfile write lock across commit, because we still have to have strict
>> serialization of updates.

> Why is the strict serialization of updates needed? To avoid overwriting
> the file with stale contents in a race condition?

Exactly.

> I was thinking that we only store the modified part in the WAL record.
> Right after writing commit record, take the lock, read() the map file,
> modify it in memory, write() it back, and release lock.

> That means that there's no full images of the file in WAL records, which
> makes me slightly uncomfortable from a disaster recovery point-of-view,

Yeah, me too, which is another advantage of using a separate WAL entry.

> That doesn't solve the problem I was trying to solve, which is that if
> the map file is rewritten, but the transaction later aborts, the map
> file update has hit the disk already. That's why I wanted to stash it
> into the commit record.

The design I sketched doesn't require such an assumption anyway.  Once
the map file is written, the relocation is effective, commit or no.
As long as we restrict relocations to maintenance operations such as
VACUUM FULL, which have no transactionally significant results, this
doesn't seem like a problem.  What we do need is that after a CLUSTER
or V.F., which is going to relocate not only the rel but its indexes,
the relocations of the rel and its indexes have to all "commit"
atomically.  But saving up the transaction's map changes and applying
them in one write takes care of that.

(What I believe this means is that pg_class and its indexes have to all
be mapped, but I'm thinking right now that no other non-shared catalogs
need the treatment.)

			regards, tom lane

In response to

Responses

pgsql-hackers by date

Next:From: Robert HaasDate: 2010-02-01 15:46:10
Subject: Re: Package namespace and Safe init cleanup for plperl UPDATE 3 [PATCH]
Previous:From: Robert HaasDate: 2010-02-01 15:43:36
Subject: pgsql: Augment EXPLAIN output with more details on Hash nodes.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group