Re: Minimally avoiding Transaction Wraparound in VLDBs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-08-31 23:24:35
Message-ID: 5588.1125530675@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> A new command is proposed - VACUUM MINIMAL. The *sole* purpose of this
> command is to do the absolute minimum required to avoid transaction id
> wraparound. (Better names welcome....)

I do not see the point. If you only need to run it every billion
transactions, saving a few cycles seems a bit pointless.

> This does the same thing as VACUUM except it:

> 1. does not VACUUM any table younger than 4 billion XIDs old

So? Your large tables are likely to be long-lived, so this isn't
actually going to save a thing in a DB that's been around long enough
to have an impending wrap problem.

> 2. does not VACUUM indexes
> RATIONALE: Indexes don't contain XIDs so don't *need* to be vacuumed.

Nonstarter. If you remove any tuples, you *must* vacuum indexes to
remove the corresponding index entries. If you don't remove any tuples,
you don't scan the indexes anyway IIRC.

> By taking those two steps, VACUUM MINIMAL will execute fairly quickly
> even on large Data Warehouses.

I don't think this'll save a thing.

> This command only makes sense when you *know* that tables don't need
> vacuuming. That is the case when:
> - you're using autovacuum, since all tables have dead-tuples removed
> whenever this is required - and so indexes will have been covered also

If you're using autovacuum then the problem is already taken care of.
It will be taken care of better in 8.2, if we add per-table tracking
of XID wraparound risk, but it's handled now. The only way that this
proposal makes any sense is if you are trying not to vacuum at all, ever.

> - you are using table partitioning and the data retention period of your
> data is less than 4 billion transactions.

Again, per-table tracking of wraparound horizon would take care of this
case, more elegantly and more safely.

> The limit is set at 4 billion because with this command we are trying to
> avoid doing work as long as possible.

You do realize that 2 billion is already the wrap horizon, and you can't
wait that long if you're doing this on a routine basis rather than
immediately-when-needed?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-09-01 00:06:40 Re: Indexing dead tuples
Previous Message Tom Lane 2005-08-31 23:06:25 Re: Indexing dead tuples