Re: The vacuum-ignore-vacuum patch

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: The vacuum-ignore-vacuum patch
Date: 2006-07-28 17:38:41
Message-ID: 20060728173840.GS66525@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On Fri, Jul 28, 2006 at 03:08:08AM +0300, Hannu Krosing wrote:
> > The other POV is that we don't really care about long-running
> > transaction in other databases unless they are lazy vacuum, a case which
> > is appropiately covered by the patch as it currently stands. This seems
> > to be the POV that Hannu takes: the only long-running transactions he
> > cares about are lazy vacuums.
>
> Yes. The original target audience of this patch are users running 24/7
> OLTP databases with big slow changing tables and small fast-changing
> tables which need to stay small even at the time when the big ones are
> vacuumed.
>
> The other possible transactions which _could_ possibly be ignored while
> VACUUMING are those from ANALYSE and non-lazy VACUUMs.

There are other transactions to consider: user transactions that will
run a long time, but only hit a limited number of relations. These are
as big a problem in an OLTP environment as vacuum is.

Rather than coming up with machinery that will special-case vacuum or
pg_dump, etc., I'd suggest thinking about a generic framework that would
work for any long-runnnig transaction. One possibility:

Transaction flags itself as 'long-running' and provides a list of
exactly what relations it will be touching.

That list is stored someplace a future vacuum can get at.

The transaction runs, with additional checks that ensure it will not
touch any relations that aren't in the list it provided.

Any vacuums that start will take into account these lists of relations
from long-running transactions and build a list of XIDs that have
provided a list, and the minimum XID for every relation that was listed.
If vacuum wants to vacuum a relation that has been listed as part of a
long-running transaction, it will use the oldest XID in the
database/cluster or the oldest XID listed for that relation, whichever
is older. If it wants to vacuum a relation that is not listed, it will
use the oldest XID in the database/cluster, excluding those XIDs that
have listed exactly what relations they will be looking at.

That scheme won't help pg_dump... in order to do so, you'd need to allow
transactions to drop relations from their list.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kenneth Marshall 2006-07-28 17:44:12 Re: Hash indexes (was: On-disk bitmap index patch)
Previous Message Jim C. Nasby 2006-07-28 17:17:14 Re: On-disk bitmap index patch

Browse pgsql-patches by date

  From Date Subject
Next Message Chris Browne 2006-07-28 18:25:23 Re: The vacuum-ignore-vacuum patch
Previous Message Jim C. Nasby 2006-07-28 17:11:48 Re: [HACKERS] Resurrecting per-page cleaner for btree