Re: autovacuum not prioritising for-wraparound tables

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-01-25 00:18:32
Message-ID: 20130125001832.GH8539@awork2.anarazel.de
Lists: pgsql-hackers

Hi Alvaro,

Nice to see a patch on this!

On 2013-01-24 18:57:15 -0300, Alvaro Herrera wrote:
> I have a bug pending that autovacuum fails to give priority to
> for-wraparound tables. When xid consumption rate is high and dead tuple
> creation is also high, it is possible that some tables are waiting for
> for-wraparound vacuums that don't complete in time because the workers
> are busy processing other tables that have accumulated dead tuples; the
> system is then down because it's too near the Xid wraparound horizon.
> Apparently this is particularly notorious in connection with TOAST
> tables, because those are always put in the tables-to-process list after
> regular tables.
>
> (As far as I recall, this was already reported elsewhere, but so far I
> have been unable to find the discussion in the archives. Pointers
> appreciated.)
>
> So here's a small, backpatchable patch that sorts the list of tables to
> process (not all that much tested yet). Tables which have the
> wraparound flag set are processed before those that are not. Other
> than this criterion, the order is not defined.
>
> Now we could implement this differently, and maybe more simply (say by
> keeping two lists of tables to process, one with for-wraparound tables
> and one with the rest) but this way it is simpler to add additional
> sorting criteria later: say within each category we could first process
> smaller tables that have more dead tuples.

If I remember correctly the issue that triggered this, I don't think this
would be sufficient to solve the whole problem, although it sure would
delay the shutdown.
Due to the high activity on the system, while some bigger, active table
got vacuumed, other previously vacuumed tables had already hit
freeze_max_age again and were thus eligible for vacuum once more, even
though other tables - in our specific case always toast relations,
because they always got added last - were very close to the shutdown
limit.

So I think we need to sort by age(relfrozenxid) in tables that are over
the anti-wraparound limit. Given your code that doesn't seem to be that
hard?
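
To make the ordering concrete, here is a rough sketch of the kind of
comparator I have in mind. It is not taken from your patch: VacCandidate,
its fields and sort_vac_candidates() are made-up names for illustration,
and xid_age is assumed to be age(relfrozenxid) computed by the caller
before sorting.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry in the tables-to-process list. */
typedef struct VacCandidate
{
    uint32_t relid;          /* table OID */
    bool     for_wraparound; /* over the anti-wraparound limit? */
    uint32_t xid_age;        /* age(relfrozenxid), precomputed (assumption) */
} VacCandidate;

/*
 * qsort comparator: for-wraparound tables sort first, and among those the
 * table with the oldest relfrozenxid (largest age) comes first. The order
 * of the remaining tables is left undefined.
 */
static int
vac_candidate_cmp(const void *a, const void *b)
{
    const VacCandidate *ca = a;
    const VacCandidate *cb = b;

    if (ca->for_wraparound != cb->for_wraparound)
        return ca->for_wraparound ? -1 : 1;

    if (ca->for_wraparound && ca->xid_age != cb->xid_age)
        return (ca->xid_age > cb->xid_age) ? -1 : 1;

    return 0;
}

/* Sort the candidate array in place using the comparator above. */
static void
sort_vac_candidates(VacCandidate *cands, size_t n)
{
    qsort(cands, n, sizeof(VacCandidate), vac_candidate_cmp);
}
```

With something like this, a toast relation that is closest to the
shutdown limit gets picked up first no matter where it was added to the
list. Further criteria for the non-wraparound tables could later be
added in the final branch.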

I think after the infrastructure is there we might want to have some
more intelligence for non-wraparound tables too, but that possibly looks
more like a HEAD than a backpatch thing.

I am very much of the opinion that this needs to be backpatched though -
it's a pretty bad thing if autovacuum cannot be relied on to keep a
system from shutting itself down because it always vacuums the wrong
relations and never gets to the problematic ones. Single-user mode is
not something normal users should ever have to see.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
