Re: [WIP] Add relminxid column to pg_class

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [WIP] Add relminxid column to pg_class
Date: 2006-04-10 00:04:41
Message-ID: 14984.1144627481@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I think a better idea is to have a separate "is read only" bit in
> pg_class. A table with that bit set doesn't need vacuuming at all, and
> needs not participate in the datminxid calculations.

I think that just complicates matters.

> One problem I see with this new bit is that a table can only have it set
> if it's correctly frozen; and how do we know if it is? We would need to
> set the relminxid to FrozenXid anyway :-(

Right.

> On further though: we can't do it on VACUUM FREEZE, because it's not a
> full vacuum and thus it doesn't have an exclusive lock on the table, so
> someone else could be modifying it. But what about creating a new
> VACUUM mode which would lock the table and set the read-only flag?

I was thinking it might be acceptable for VACUUM FREEZE to take
ExclusiveLock (not AccessExclusiveLock). That would still allow
concurrent readers, and it's not clear why you'd want to do VACUUM
FREEZE on a table that has active writers.

>> I remember we had decided against the idea of having the first
>> modification of a frozen table change its relminxid, but I've forgotten
>> what the rationale was ... do you remember?

> Actually in the end I decided not to explore that route further because
> I wasn't sure how to deal with it from WAL. If you freeze a table, then
> "unfreeze" it and the system crashes, how does the modification reach
> pg_class?

You certainly must emit a WAL entry for the action of unfreezing, but I
don't see any particular reason why that's a bad idea. Transitioning a
table between frozen and unfrozen states should be rare enough that
emitting a WAL entry for it is not a performance problem.

Here's a sketch of the idea as it's developing in my mind:

1. We consider that relminxid = FrozenXid means that the table is
frozen, ie is guaranteed to contain no valid XIDs except FrozenXid.
Otherwise, relminxid must be a lower bound on the non-frozen XIDs in
the table.

2. VACUUM FREEZE acquires ExclusiveLock, vacuums the table replacing
all XIDs with FrozenXid, and if successful sets relminxid = FrozenXid.
(It might not be successful, eg it might see recently-dead tuples it
can't remove; we can't replace their xmin/xmax obviously.) In all other
cases, VACUUM sets relminxid = Min(oldest unfrozen xid in table,
transaction xmin) (or some other convenient lower-bound computation,
eg maybe just use the cutoff instead of actively figuring the min XID).
I think we have to XLOG the setting of relminxid to be safe.

3. Any modification of a table (that inserts an XID into it) must check
to see if relminxid = FrozenXid, and if so change it to transaction xmin
(or some other lower bound on the oldest running XID). This action has
to be WAL-logged.

4. VACUUM has to recompute datminxid to be the oldest non-frozen
relminxid in the database (but not more than transaction xmin, to cover
case where someone else is creating a table concurrently). We might be
able to go back to your idea of not having to do this work unless the
prior value of relminxid matches datminxid. I think plain VACUUM could
skip tables having relminxid = FrozenXid altogether.

I'm tempted to say that the "unfreezing" action (#3) could just be done
at the point where we open a rel and take a stronger-than-AccessShare
lock on it. This would minimize the overhead needed, and give us pretty
good confidence we'd not missed any places. It'd mean that, say, an
UPDATE that changed no rows would still mark the table unfrozen, but I
see no great downside to that.

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Alvaro Herrera 2006-04-10 00:36:03 Re: [WIP] Add relminxid column to pg_class
Previous Message Alvaro Herrera 2006-04-09 23:31:29 Re: [WIP] Add relminxid column to pg_class