Re: [HACKERS] Per-table freeze limit proposal

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] Per-table freeze limit proposal
Date: 2005-11-15 02:40:52
Message-ID: 20051115024052.GA18588@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > Cool. I wonder if the exact figure should be
> > min(lowest non-frozen Xid in table, GetOldestXmin(false))
>
> Actually just min(lowest Xid in table, RecentXmin). You only need to be
> sure there are no running transactions older than what you put into the
> field; their xmins are not at issue.

Ok, patch attached. Two new columns in pg_class store two Xids: the
"relminxid" is the one in the equation above. The relvacuumxid is the
OldestXmin. Two columns in pg_database replace the previous two,
datminxid is the minimum of all relminxid in the database, and
datvacuumxid is the minimum of all relvacuumxid. (datfreezexid is no
more.)

So we can check the Xid wrap horizon using datminxid, and truncate clog
using datvacuumxid. (Actually I was going to post the patch yesterday
without the datvacuumxid/relvacuumxid part, when I noticed that I was
truncating clog on datminxid which seemed a bad idea.)

Additionally I made DROP TABLE invalidate datminxid and datvacuumxid
when the table with the minimum is dropped. New values for the
pg_database columns are calculated by scanning pg_class at the next
vacuum when any of them is invalid, or when the table which was holding
the minimum back is vacuumed. New values for the pg_class columns are
updated every vacuum, as appropiate.

The whole thing is pretty fragile is somebody manually updates a
catalog. But we tell people not to do that, so it should be their
fault, right?

I discovered one problem with the whole approach. Per this patch, we
only store normal Xids in relminxid/relvacuumxid. So if a table is
completely frozen, we will store RecentXmin. We do this because it
would actually be unsafe to store, say, FrozenXid: if another
transaction stores/changes a tuple while we are vacuuming it, the Frozen
mark wouldn't be correct and thus the table could be corrupted if a Xid
wraparound happens (which is why we use RecentXmin in the first place:
to cope with the possibility of someone else using the table during the
vacuum.)

The problem comes when this is done to template1, and it is copied to
another database after some millions of transactions have come and go --
it will seem like the database has suffered wraparound. We would need
to vacuum it completely after copied for the stats to be accurate.

I'm not sure what to do about that. I think storing FrozenXid may not
actually be a totally bad idea. Comments?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment Content-Type Size
vacuum-minxid-2.patch text/plain 84.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2005-11-15 02:47:08 Re: PostgreSQL roadmap for 8.2 and beyond.
Previous Message Tom Lane 2005-11-15 02:34:19 Re: Long-time 7.4 contrib failure Mac OS X 10.3.8

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2005-11-15 03:07:49 Re: Interval aggregate regression failure (expected seems
Previous Message Andrew Dunstan 2005-11-14 17:06:30 Re: drop if exists