Re: autovacuum scheduling starvation and frenzy

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum scheduling starvation and frenzy
Date: 2014-05-15 23:06:29
Message-ID: CAMkU=1yZcZv-Yu-ySjXYpnBsSRi76_iJCPM_+adAAT_+24HbcA@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 15, 2014 at 12:55 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:

> Jeff Janes wrote:
>
> > If you have a database with a large table in it that has just passed
> > autovacuum_freeze_max_age, all future workers will be funnelled into that
> > database until the wrap-around completes. But only one of those workers
> > can actually vacuum the one table which is holding back the frozenxid.
> > Maybe the 2nd worker to come along will find other useful work to do, but
> > eventually all the vacuuming that needs doing is already in progress, and
> > so each worker starts up, gets directed to this database, finds it can't
> > help, and exits. So all other databases are entirely starved of
> > autovacuuming for the entire duration of the wrap-around vacuuming of
> this
> > one large table.
>
> Bah. Of course :-(
>
> Note that if you have two databases in danger of wraparound, the oldest
> will always be chosen until it's no longer in danger. Ignoring the
> second one past freeze_max_age seems bad also.
>

I'm not sure how bad that is. If you really do want to get the frozenxid
advanced as soon as possible, it makes sense to focus on one at a time,
rather than splitting the available IO throughput between two of them. So
I wouldn't go out of my way to enable two to run at the same time, nor go
out of my way to prevent it.

If most wraparound scans were done as part of a true emergency, it would
make sense to forbid all other vacuums (but only if you also automatically
disabled autovacuum_vacuum_cost_delay as part of the emergency) so as not
to divide up the IO throughput. But most are not emergencies, as
200,000,000 is a long way from 2,000,000,000.

>
> This code is in autovacuum.c, do_start_worker(). Not sure what your
> proposal looks like in terms of code.

I wasn't sure either; I was mostly trying to analyze the situation. But I
decided that just moving the "skipit" chunk of code above the wrap-around
code might work for experimental purposes, as attached. It has been
running that way for a few hours and I no longer see the frenzies
occurring whenever pgbench_history gets vacuumed.
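
To make the experiment concrete, here is a toy, self-contained model of the
selection decision with the skip test applied before the wraparound test.
The struct and field names are invented for illustration and don't match
the real autovacuum.c data structures, so read it as a sketch of the intent
rather than as the actual patch:

/*
 * Toy model of the database-selection decision in do_start_worker().
 * Everything here (ToyDatabase, recently_served, etc.) is made up for
 * illustration; it is not the real autovacuum.c code.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct ToyDatabase
{
    const char  *name;
    unsigned int age;               /* age(datfrozenxid); larger = older */
    double       last_autovac_time; /* when autovacuum last visited it */
    bool         recently_served;   /* stand-in for the "skipit" test against
                                     * the in-memory database list */
} ToyDatabase;

/*
 * Pick the next database to send a worker to.  In the stock ordering the
 * wraparound test comes first, so a database past freeze_max_age wins even
 * if a worker was just sent there; applying the skip test first lets the
 * other databases keep getting workers while the long anti-wraparound
 * vacuum runs.
 */
static ToyDatabase *
choose_database(ToyDatabase *dbs, size_t ndbs, unsigned int freeze_max_age)
{
    ToyDatabase *chosen = NULL;

    for (size_t i = 0; i < ndbs; i++)
    {
        ToyDatabase *db = &dbs[i];

        /* Experimental ordering: honor the skip test before anything else. */
        if (db->recently_served)
            continue;

        /* Wraparound danger still gets priority, oldest first. */
        if (db->age > freeze_max_age)
        {
            if (chosen == NULL || chosen->age <= freeze_max_age ||
                db->age > chosen->age)
                chosen = db;
            continue;
        }

        /* Otherwise fall back to the least recently autovacuumed database. */
        if (chosen == NULL ||
            (chosen->age <= freeze_max_age &&
             db->last_autovac_time < chosen->last_autovac_time))
            chosen = db;
    }

    return chosen;
}

int
main(void)
{
    ToyDatabase dbs[] = {
        {"big_db",   250000000, 100.0, true},  /* past freeze_max_age, but
                                                * recently served */
        {"other_db",  50000000,  50.0, false},
        {"third_db",  60000000,  10.0, false},
    };
    ToyDatabase *db = choose_database(dbs, 3, 200000000);

    printf("next worker goes to: %s\n", db ? db->name : "(none)");
    return 0;
}

With the skip test first, big_db still wins as soon as nothing has recently
been sent there, but the other databases keep getting workers in the
meantime, which is the behavior I'm after.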

But I can't figure out why we sometimes use adl_next_worker and sometimes
use last_autovac_time, which makes me question how much I really understand
this code.

> I think that instead of
> trying to get a single target database in that foreach loop, we could
> try to build a prioritized list (in-wraparound-danger first, then
> in-multixid-wraparound danger, then the one with the oldest autovac time
> of all the ones that remain); then recheck the wrap-around condition by
> seeing whether there are other workers in that database that started
> after the wraparound condition appeared.

I think we would want to check for one worker that is still running, and at
least one other worker that started and completed since the wraparound
threshold was exceeded. If there are multiple tables in the database that
need full scanning, it would make sense to have multiple workers. But if a
worker already started and finished without increasing the frozenxid,
another attempt probably won't accomplish much either. But I have no idea
how to do that bookkeeping, or how much of an improvement it would be over
something simpler.
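
For what it's worth, the prioritization half of your idea seems easy enough
to sketch. Something like the comparator below (again with invented struct
and field names, not the real autovacuum.c structures) is roughly what I
picture; the hard part is still the recheck that decides whether a
wraparound entry at the head of the list is worth yet another worker:

/*
 * Rough sketch of the prioritized-list ordering: XID-wraparound danger
 * first, then multixact-wraparound danger, then oldest autovacuum time.
 * ToyCandidate and its fields are made up for illustration only.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct ToyCandidate
{
    const char *name;
    bool        xid_wraparound;    /* past autovacuum_freeze_max_age */
    bool        mxid_wraparound;   /* past autovacuum_multixact_freeze_max_age */
    double      last_autovac_time; /* smaller = visited longer ago */
} ToyCandidate;

static int
candidate_cmp(const void *a, const void *b)
{
    const ToyCandidate *ca = a;
    const ToyCandidate *cb = b;

    /* XID wraparound danger sorts ahead of everything else. */
    if (ca->xid_wraparound != cb->xid_wraparound)
        return ca->xid_wraparound ? -1 : 1;

    /* Then multixact wraparound danger. */
    if (ca->mxid_wraparound != cb->mxid_wraparound)
        return ca->mxid_wraparound ? -1 : 1;

    /* Then whichever database has gone longest without an autovacuum. */
    if (ca->last_autovac_time < cb->last_autovac_time)
        return -1;
    if (ca->last_autovac_time > cb->last_autovac_time)
        return 1;
    return 0;
}

int
main(void)
{
    ToyCandidate dbs[] = {
        {"plain_db", false, false, 10.0},
        {"mxid_db",  false, true,  80.0},
        {"xid_db",   true,  false, 90.0},
    };

    qsort(dbs, 3, sizeof(ToyCandidate), candidate_cmp);

    for (int i = 0; i < 3; i++)
        printf("%d: %s\n", i + 1, dbs[i].name);
    return 0;
}

The worker would then walk the sorted list from the front and skip any
database where the recheck shows other workers already started (or started
and finished) after the wraparound condition appeared, which is exactly the
bookkeeping I can't quite see how to do.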

Cheers,

Jeff

Attachment Content-Type Size
vac_wrap_move.patch application/octet-stream 2.6 KB
